Version: 0.32.2

Cluster

The Cluster skill lets you divide a dataset into groups (clusters) of similar data. Unlike [Train], the Cluster skill doesn't require a labeled dataset.

Format

Cluster has a single format with several variations; each part in parentheses is optional: Cluster data (excluding | including) <columns> (setting number of clusters to <number of clusters>) (using <model>).

Parameters

Cluster uses the following parameters:

  • columns (optional). The columns to exclude from or include in the clustering. If you don't specify any columns, all columns in the dataset are used.
  • number of clusters (optional). The number of clusters to create.
  • model (optional). The model to use: either the Hierarchical Clustering Model or the KMeans Clustering Model. If you don't specify a model, both are used to cluster the data.
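
The way the optional parts combine into one command can be sketched in Python. This is only an illustration of the format described above: build_cluster_command and its argument names are hypothetical and are not part of the product.

```python
def build_cluster_command(columns=None, mode="including", n_clusters=None, model=None):
    """Assemble a Cluster command string from its optional parts.

    Hypothetical helper for illustration; the argument names are assumptions.
    """
    if mode not in ("including", "excluding"):
        raise ValueError("mode must be 'including' or 'excluding'")
    parts = ["Cluster data"]
    # Column selection is optional; with no columns, the whole dataset is used.
    if columns:
        parts.append(f"{mode} {', '.join(columns)}")
    # The number of clusters is optional.
    if n_clusters is not None:
        parts.append(f"setting number of clusters to {n_clusters}")
    # The model is optional.
    if model:
        parts.append(f"using {model}")
    return " ".join(parts)


print(build_cluster_command())
print(build_cluster_command(["Age", "Fare"], mode="excluding", n_clusters=4))
```

Leaving out every argument yields the shortest valid command, Cluster data, and each argument adds exactly one clause in the order the format requires.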

Output

If data is successfully clustered, a preview message appears in the chat history. This message contains a link to preview the ClusterResults dataset. An output also appears in the visualization panel displaying the Cluster Centroids, Models, Scores, and Pipeline Report.
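
To give a feel for what a cluster centroid represents, here is a minimal k-means sketch in plain Python. It is a generic textbook version under simplifying assumptions (the first k rows serve as starting centroids, and the sample rows are made up); it is not the skill's actual implementation, and the names kmeans and rows are hypothetical.

```python
from math import dist


def kmeans(points, k, iterations=10):
    """Tiny k-means sketch: returns (centroids, labels).

    Assumption for simplicity: the first k points are the initial centroids.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


# Made-up (Age, Fare) rows, used only for illustration.
rows = [(22.0, 7.25), (24.0, 8.05), (38.0, 71.28), (35.0, 53.10)]
centroids, labels = kmeans(rows, k=2)
print(labels)
print(centroids)
```

In this sketch, the two low-fare rows end up in one cluster and the two high-fare rows in the other, and each centroid is the average of the rows assigned to it.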

Examples

To cluster the "Titanic Dataset" in its entirety, enter Cluster data.

To cluster only the Age, Fare, and Gender columns into three clusters with the Hierarchical Clustering Model, enter Cluster data including Age, Fare, Gender setting number of clusters to 3 using Hierarchical Clustering Model.