Split lets you clean and explore your data by splitting a column into multiple new columns. You can split a column based on a delimiter or at a certain position, such as the second character or the third number in the column.
Split has two utterance variations:
Split <column> at the position <position> (calling the output columns <names>)
Split <column> by the delimiter <delimiter> (calling the output columns <names>)
Split uses the following parameters:
Column (required). The column you want to split.
Position (required when splitting at a position). The position at which you want to split the value. For example, if you want to split the value at the first character or number, enter 1.
Delimiter (required when splitting by a delimiter). The delimiter you want to use to split the values in the column. This can be a number, letter, or symbol. A space (“ ”) is the default delimiter.
Names (optional). If you want to give the resulting columns custom names, enter the names as a comma-separated list.
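The two ways of splitting can be sketched in plain Python. This is only an illustration of the idea, not how Split is implemented: the function names `split_at_position` and `split_by_delimiter` are hypothetical, and the sketch assumes that position 1 means "split after the first character" and that a delimiter split applies to every occurrence of the delimiter.

```python
from typing import List, Tuple


def split_at_position(value: str, position: int) -> Tuple[str, str]:
    """Split a value into two parts after the first `position` characters.

    Assumption: position 1 means "after the first character".
    """
    return value[:position], value[position:]


def split_by_delimiter(value: str, delimiter: str = " ") -> List[str]:
    """Split a value on every occurrence of the delimiter.

    A space is used when no delimiter is given, mirroring the default above.
    """
    return value.split(delimiter)


print(split_at_position("A23", 1))       # ('A', '23')
print(split_by_delimiter("A-23", "-"))   # ['A', '23']
print(split_by_delimiter("John Smith"))  # ['John', 'Smith']
```

A position split always yields two parts, while a delimiter split yields one more part than the number of delimiters found in the value.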
If the column is split successfully, the new columns are appended to your dataset, a sample of the updated dataset is shown in the display panel, and a success message is shown in the chat box.
Otherwise, an error message is shown in the chat box.
Consider a dataset called “Titanic” that contains information on each passenger, including the following columns:
Age. Their age.
Gender. Their gender.
Name. Their name.
PClass. Their class.
Cabin. The passenger’s cabin ID.
Survived. Whether they survived the disaster.
Values in the Cabin column are often listed as “level-room number,” such as A-23. To split the Cabin column into CabinLevel and CabinNumber columns, enter:
Split Cabin by the delimiter - calling the output columns CabinLevel, CabinNumber.
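
For readers who want to see the equivalent operation in code, here is a minimal Python sketch of what this utterance does to the data. It is not the tool's implementation. The helper `split_column` and the two sample rows are hypothetical, made up for illustration; only the Cabin value A-23 and the column names come from the example above. The sketch assumes the original Cabin column is kept and the new columns are appended to each row.

```python
from typing import Dict, List, Optional


def split_column(
    rows: List[Dict[str, Optional[str]]],
    column: str,
    delimiter: str,
    names: List[str],
) -> List[Dict[str, Optional[str]]]:
    """Return copies of the rows with the split parts appended as new columns."""
    result = []
    for row in rows:
        parts: List[Optional[str]] = list(str(row[column]).split(delimiter))
        # Pad with None so a value with too few parts still fills every new column.
        parts += [None] * (len(names) - len(parts))
        new_row = dict(row)  # keep the original columns, including the one being split
        # zip stops at the shorter sequence, so parts beyond the given names are dropped.
        new_row.update(zip(names, parts))
        result.append(new_row)
    return result


# Hypothetical sample rows, not real Titanic records.
passengers = [
    {"Name": "Passenger 1", "Cabin": "A-23"},
    {"Name": "Passenger 2", "Cabin": "C-85"},
]

updated = split_column(passengers, "Cabin", "-", ["CabinLevel", "CabinNumber"])
for row in updated:
    print(row["Cabin"], row["CabinLevel"], row["CabinNumber"])
```

Each output row still contains Cabin, plus the new CabinLevel and CabinNumber values.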