Split
Split lets you clean and explore your data by splitting a column into multiple new columns. You can split a column based on a delimiter or at a certain position, such as the second character or the third number in the column.
Format
Split has two formats:
Split <column> at the position <position> (calling the output columns <names>)
Split <column> by the delimiter <delimiter> (into columns <names>)
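The two formats can be illustrated on a single value. The following Python sketch is not the tool's implementation; the function names are hypothetical, and it assumes that a position of 1 means "split after the first character" and that a delimiter split happens at every occurrence of the delimiter.

```python
from __future__ import annotations


def split_at_position(value: str, position: int) -> tuple[str, str]:
    """Split a value after its first `position` characters.

    Assumption: position 1 means "split after the first character".
    """
    return value[:position], value[position:]


def split_by_delimiter(value: str, delimiter: str = " ") -> list[str]:
    """Split a value at every occurrence of the delimiter.

    The default delimiter is a single space, as described for Split.
    """
    return value.split(delimiter)


print(split_at_position("A23", 1))      # ('A', '23')
print(split_by_delimiter("A-23", "-"))  # ['A', '23']
```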
Parameters
Split uses the following parameters:
Column (required). The column you want to split.
Position (required). The position at which you want to split the value. For example, if you want to split the value at the first character or number, enter 1.
Delimiter (required). The delimiter you want to use to split the values in the column. This can be a number, letter, or symbol. A space (“ ”) is the default delimiter.
Names (optional). If you want to give the resulting columns custom names, enter the names as a comma-separated list.
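As a rough illustration of how a comma-separated Names value maps to individual column names, the Python sketch below trims whitespace around each name and drops empty entries. The helper name parse_names and this trimming behavior are assumptions made for the example, not documented behavior of Split.

```python
def parse_names(names: str) -> list:
    """Turn a comma-separated string into a list of column names.

    Assumption: surrounding whitespace is ignored and empty entries are skipped.
    """
    return [name.strip() for name in names.split(",") if name.strip()]


print(parse_names("CabinLevel, CabinNumber"))  # ['CabinLevel', 'CabinNumber']
```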
Output
If the column is split successfully, the new columns are appended to your dataset, a sample of the updated dataset is shown in the Data tab, and a success message is shown in the conversation history.
Otherwise, an error message is shown.
Examples
Consider a dataset called “Titanic” that contains information on each passenger, including the following columns:
Age. The passenger’s age.
Gender. The passenger’s gender.
Name. The passenger’s name.
PClass. The passenger’s class.
Cabin. The passenger’s cabin ID.
Survived. Whether the passenger survived the disaster.
Values in the Cabin column are often listed as “level-room number,” such as A-23. To split the Cabin column into CabinLevel and CabinNumber columns, enter:
Split Cabin by the delimiter - into columns CabinLevel, CabinNumber
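To make the effect of the Cabin example concrete, here is a minimal Python sketch of the same operation on a few rows. It is an illustration only, not how Split is implemented: the split_column helper is hypothetical, the sample rows other than A-23 are invented, and filling a missing part with None is an assumption.

```python
def split_column(rows, column, delimiter, names):
    """Return copies of the rows with the split parts appended as new columns."""
    result = []
    for row in rows:
        # Split into at most len(names) parts so extra delimiters stay in the last part.
        parts = row[column].split(delimiter, len(names) - 1)
        # Assumption: values with too few parts are padded with None.
        parts += [None] * (len(names) - len(parts))
        new_row = dict(row)
        new_row.update(zip(names, parts))
        result.append(new_row)
    return result


# Invented sample rows; only the "A-23" value comes from the example above.
titanic = [
    {"Name": "Passenger 1", "Cabin": "A-23"},
    {"Name": "Passenger 2", "Cabin": "C-101"},
]

for row in split_column(titanic, "Cabin", "-", ["CabinLevel", "CabinNumber"]):
    print(row)
```

The original Cabin column is kept and the two new columns are added alongside it, which matches the description of the split columns being appended to the dataset.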