Version: 0.35.7


The Cluster skill divides a dataset into groups (clusters) of similar data. Unlike [Train], the Cluster skill doesn't require a labeled dataset.
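To illustrate what clustering without labels means, here is a minimal pure-Python sketch of the general k-means idea on a single numeric column. It is an illustration only: the function name `kmeans_1d`, the sample values, and the simple initialization are assumptions made for this example, and nothing here is claimed to match how the skill's KMeans Clustering Model is implemented.

```python
from typing import List, Sequence, Tuple


def kmeans_1d(values: Sequence[float], k: int, iterations: int = 10) -> Tuple[List[int], List[float]]:
    """Group numbers into k clusters; hypothetical helper for illustration."""
    ordered = sorted(values)
    # Deterministic start: pick k values spread evenly across the sorted data.
    centroids = [float(ordered[i * (len(ordered) - 1) // max(k - 1, 1)]) for i in range(k)]

    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        groups: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Update step: each centroid moves to the mean of its group.
        # An empty group keeps its previous centroid.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]

    # Final cluster index for every input value, in input order.
    labels = [min(range(k), key=lambda i: abs(v - centroids[i])) for v in values]
    return labels, centroids


# Made-up sample values; no labels are supplied, only the raw numbers.
ages = [2, 4, 3, 30, 32, 31]
labels, centroids = kmeans_1d(ages, k=2)
print(labels, centroids)
```

The input carries no class labels; the groups emerge only from how close the values are to one another.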


Cluster has a single format with several optional variations: Cluster data (excluding | including) <columns> (setting number of clusters to <number of clusters>) (using <model>).


Cluster uses the following parameters:

  • columns (optional). The columns to exclude from or include in the clustering. If no columns are specified, all columns in the dataset are used.
  • number of clusters (optional). The number of clusters to create.
  • model (optional). The model to use: either the Hierarchical Clustering Model or the KMeans Clustering Model. If no model is specified, both are used to cluster the data.
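The way the optional parameters combine into a single phrase can be sketched as a small Python helper. This is only an illustration of the format described above: `build_cluster_command` and its argument names are hypothetical and not part of the product, which takes the phrase as typed text.

```python
from typing import Optional, Sequence

# The two model names mentioned in the parameter list.
MODELS = ("Hierarchical Clustering Model", "KMeans Clustering Model")


def build_cluster_command(
    columns: Optional[Sequence[str]] = None,
    include: bool = True,
    num_clusters: Optional[int] = None,
    model: Optional[str] = None,
) -> str:
    """Assemble a Cluster phrase from its optional parts (hypothetical helper)."""
    parts = ["Cluster data"]

    # Columns are optional; omitting them means every column is used.
    if columns:
        keyword = "including" if include else "excluding"
        parts.append(f"{keyword} {', '.join(columns)}")

    # The number of clusters is optional.
    if num_clusters is not None:
        if num_clusters < 1:
            raise ValueError("number of clusters must be a positive integer")
        parts.append(f"setting number of clusters to {num_clusters}")

    # The model is optional; omitting it means both models are used.
    if model is not None:
        if model not in MODELS:
            raise ValueError(f"model must be one of {MODELS}")
        parts.append(f"using {model}")

    return " ".join(parts)


print(build_cluster_command())
print(build_cluster_command(["Age", "Fare"], include=False, model="KMeans Clustering Model"))
```

Each argument left at its default simply drops the corresponding part of the phrase, mirroring the fact that every parameter is optional.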


If data is successfully clustered, a preview message appears in the chat history. This message contains a link to preview the ClusterResults dataset. An output also appears in the visualization panel displaying the Cluster Centroids, Models, Scores, and Pipeline Report.


To cluster the "Titanic Dataset" in its entirety, enter Cluster data.

To cluster the Age, Fare, and Gender columns into three clusters with the Hierarchical Clustering Model, enter Cluster data including Age, Fare, Gender setting number of clusters to 3 using Hierarchical Clustering Model.