Impute

Impute lets you use machine learning to fill in missing values in a column. For each row that is missing a value in that column, Impute compares the row to a specified number of the most closely related rows (the "neighbors") and uses them to predict what the missing value should be.

Format

Impute uses a single format: Impute missing values in the column <target column> using information from <grouping columns> comparing the <number> closest neighbors.

Parameters

Impute uses the following parameters:

  • target column (required). The column whose missing values you are trying to predict.
  • grouping columns (required). A comma-separated list of columns whose values to use to predict the missing values in the target column.
  • number (required). The number of neighbors to compare. This must be set to at least three.
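
To make the roles of the three parameters concrete, here is a minimal sketch of nearest-neighbor imputation in Python. It is not the tool's actual implementation, which this page does not describe. It assumes numeric grouping columns, Euclidean distance, and averaging the neighbors' values. The function name impute_knn, the sample columns Height and Weight, and the "_imputed" suffix on the new column are all hypothetical.

```python
import math


def impute_knn(rows, target, grouping, k):
    """Return copies of rows with an added '<target>_imputed' column.

    rows is a list of dicts; a missing target value is represented by None.
    Assumption: every grouping column holds numbers and has no missing values.
    """
    if k < 3:
        # Mirrors the documented rule that the number must be at least three.
        raise ValueError("number of neighbors must be at least 3")

    # Only rows that already have a target value can act as neighbors.
    known = [r for r in rows if r[target] is not None]
    new_column = target + "_imputed"  # hypothetical column name
    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are left unchanged
        if row[target] is not None:
            new_row[new_column] = row[target]
        else:
            point = [row[c] for c in grouping]
            # Take the k complete rows closest to this one (fewer if not enough exist).
            nearest = sorted(
                known,
                key=lambda other: math.dist(point, [other[c] for c in grouping]),
            )[:k]
            if nearest:
                # Predict the missing value as the mean of the neighbors' values.
                new_row[new_column] = sum(n[target] for n in nearest) / len(nearest)
            else:
                new_row[new_column] = None
        result.append(new_row)
    return result


rows = [
    {"Height": 150, "Weight": 50, "Age": 20},
    {"Height": 152, "Weight": 52, "Age": 22},
    {"Height": 149, "Weight": 49, "Age": 24},
    {"Height": 190, "Weight": 95, "Age": 60},
    {"Height": 151, "Weight": 51, "Age": None},
]
imputed = impute_knn(rows, target="Age", grouping=["Height", "Weight"], k=3)
print(imputed[-1]["Age_imputed"])  # prints 22.0
```

In this sketch the last row is closest to the first three rows, so its predicted Age is the mean of 20, 22, and 24. The fourth row is far away on both grouping columns and does not influence the prediction.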

Output

If the missing values are successfully predicted, a success message appears in the conversation history and a sample of the dataset is shown in the Data tab. The dataset also has a new column that contains the predicted values for the target column.

Examples

To impute the missing values of a column called "Age" by comparing the five closest neighbors, using information from a column called "Gender," enter Impute missing values in column Age using information from Gender comparing the 5 closest neighbors.