Impute lets you use machine learning to replace missing values in a column. For each row with a missing value in that column, Impute finds the specified number of most closely related rows (the "neighbors") and uses them to predict what the missing value should be.
Impute is similar to Fill, with the following differences:
- Fill does not use machine learning.
- Fill replaces the missing values in the column, while Impute creates a new column containing the imputed values of the target column for each row.
Impute uses a single utterance:
Impute missing values in the column <target column> using information from <grouping columns> comparing the <number> closest neighbors
Impute uses the following parameters:
target column (required). The column whose missing values you are trying to predict.
grouping columns (required). A comma-separated list of columns whose values are used to predict the missing values in the target column.
number (required). The number of neighbors to compare. This must be set to at least three.
If the missing values are successfully predicted, a success message appears in the chat history and a sample of the dataset is shown in the display panel. The dataset also has a new column that contains the predicted values for the target column.
To impute the missing values of a column called "Age" using information from a column called "Gender" and comparing the five closest neighbors, enter:
Impute missing values in column Age using information from Gender comparing the 5 closest neighbors.
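
To give a feel for what nearest-neighbor imputation does, here is a minimal Python sketch of the general technique. It is not the implementation behind Impute. The function name impute_knn, the output column name "Age_imputed", the distance measure (absolute difference for numbers, match or mismatch for other values), and the use of the neighbors' mean as the prediction are all assumptions made for illustration. The sketch uses three neighbors to keep the sample data short.

```python
from statistics import mean


def impute_knn(rows, target, grouping, k=5, new_column=None):
    """Return copies of rows with an extra column holding imputed target values.

    Illustrative only: the distance measure and the averaging step are
    assumptions, not the documented behavior of Impute.
    """
    if k < 3:
        # Mirrors the documented rule that at least three neighbors are required.
        raise ValueError("number of neighbors must be at least 3")
    new_column = new_column or f"{target}_imputed"  # hypothetical column name

    # Only rows that already have a target value can serve as neighbors.
    known = [r for r in rows if r.get(target) is not None]

    def distance(a, b):
        # Assumed metric: absolute difference for numbers, 0/1 mismatch otherwise.
        total = 0.0
        for col in grouping:
            x, y = a[col], b[col]
            if isinstance(x, (int, float)) and isinstance(y, (int, float)):
                total += abs(x - y)
            else:
                total += 0.0 if x == y else 1.0
        return total

    result = []
    for row in rows:
        out = dict(row)  # copy, so the original target column stays untouched
        if row.get(target) is None:
            neighbors = sorted(known, key=lambda r: distance(row, r))[:k]
            out[new_column] = mean(r[target] for r in neighbors) if neighbors else None
        else:
            out[new_column] = row[target]
        result.append(out)
    return result


rows = [
    {"Gender": "F", "Age": 30},
    {"Gender": "F", "Age": 34},
    {"Gender": "F", "Age": 38},
    {"Gender": "M", "Age": 50},
    {"Gender": "M", "Age": 54},
    {"Gender": "M", "Age": 58},
    {"Gender": "F", "Age": None},
    {"Gender": "M", "Age": None},
]

imputed = impute_knn(rows, "Age", ["Gender"], k=3)
for row in imputed:
    print(row)
```

In this sketch, the original "Age" column is left as it was and the predictions go into a separate column, which matches the way Impute differs from Fill.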