Split

Split lets you clean and explore your data by splitting a column into multiple new columns. You can split a column by a delimiter or at a certain position, such as the second character or the third number in each value.

Format

Split has two formats:

  • Split <column> at the position <position> (calling the output columns <names>)
  • Split <column> by the delimiter <delimiter> (into columns <names>)

Parameters

Split uses the following parameters:

  • Column (required). The column you want to split.
  • Position (required when splitting at a position). The position at which you want to split each value. For example, to split the value at the first character or number, enter 1.
  • Delimiter (required when splitting by a delimiter). The delimiter you want to use to split the values in the column. This can be a number, letter, or symbol. A space (“ ”) is the default delimiter.
  • Names (optional). If you want to give the resulting columns custom names, enter the names as a comma-separated list.
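
To make the two modes concrete, here is a minimal Python sketch of what each one does to a single value. It is an illustration only, not how Split is implemented: the function names are hypothetical, and it assumes that a position of 1 means "split after the first character."

```python
from __future__ import annotations


def split_at_position(value: str, position: int) -> tuple[str, str]:
    """Split a value after its first `position` characters.

    Assumption: positions count characters from the start, beginning at 1.
    """
    return value[:position], value[position:]


def split_by_delimiter(value: str, delimiter: str = " ") -> list[str]:
    """Split a value at every occurrence of the delimiter.

    A space is used when no delimiter is given, mirroring the default above.
    """
    return value.split(delimiter)


print(split_at_position("A-23", 1))     # ('A', '-23')
print(split_by_delimiter("A-23", "-"))  # ['A', '23']
print(split_by_delimiter("first second"))  # ['first', 'second']
```

Note that a position split keeps every character, while a delimiter split drops the delimiter itself.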

Output

If the column is split successfully, the new columns are appended to your dataset, a sample of the updated dataset is shown in the Data tab, and a success message is shown in the conversation history.

Otherwise, an error message is shown.

Examples

Consider a dataset called “Titanic” that contains information on each passenger, including the following columns:

  • Age. The passenger’s age.
  • Gender. The passenger’s gender.
  • Name. The passenger’s name.
  • PClass. The passenger’s class.
  • Cabin. The passenger’s cabin ID.
  • Survived. Whether the passenger survived the disaster.

Values in the Cabin column are often listed as “level-room number,” such as A-23. To split the Cabin column into CabinLevel and CabinNumber columns, enter the following:

  Split Cabin by the delimiter - into columns CabinLevel, CabinNumber
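
For comparison, the same operation can be sketched in plain Python. This is a rough equivalent under stated assumptions, not the tool's actual behavior: the helper split_column is hypothetical, the sample rows are made up for illustration (only A-23 comes from the example above), and values with fewer parts than names are assumed to be padded with None.

```python
def split_column(rows, column, delimiter, names):
    """Return copies of rows with the split parts appended as new columns."""
    result = []
    for row in rows:
        parts = row[column].split(delimiter)
        # Assumption: pad with None when a value has fewer parts than names.
        parts += [None] * (len(names) - len(parts))
        new_row = dict(row)  # keep the original columns, including Cabin
        new_row.update(zip(names, parts))
        result.append(new_row)
    return result


# Made-up sample rows; only the "level-room number" pattern is from the text.
titanic = [
    {"PClass": "1st", "Cabin": "A-23"},
    {"PClass": "1st", "Cabin": "C-85"},
]

updated = split_column(titanic, "Cabin", "-", ["CabinLevel", "CabinNumber"])
for row in updated:
    print(row["Cabin"], row["CabinLevel"], row["CabinNumber"])
```

The sketch appends the new columns and leaves Cabin in place, matching the Output section's description of split columns being appended to the dataset.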