Version: 0.28.2

Explore

After you've loaded data into your session, explore it before diving into analysis. Exploring first helps you understand the types of data you're working with, the quality of that data, and its general statistics.

note

When a skill is applied to a dataset:

  • If the skill creates a new dataset, it will use the convention [dataset]_[Skill].
  • If the skill alters your existing dataset, it will use the convention [dataset] v[x] to save to a new version.
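The two naming conventions in this note can be sketched in Python. This is only an illustration: the function names are hypothetical, and treating [x] as an integer version number is an assumption.

```python
def new_dataset_name(dataset: str, skill: str) -> str:
    """Name for a dataset created by a skill: [dataset]_[Skill]."""
    return f"{dataset}_{skill}"


def new_version_name(dataset: str, version: int) -> str:
    """Name for a new version of an altered dataset: [dataset] v[x].

    Assumes [x] is an integer version number.
    """
    return f"{dataset} v{version}"


print(new_dataset_name("sales", "Describe"))
print(new_version_name("sales", 2))
```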

Interactive Dataset Panel

The Dataset Panel provides a number of ways to adjust your data.

Adjust Column Width

You can adjust the width of columns in the dataset panel by clicking on the divider between columns and dragging to expand or compress the column. You can also double-click the divider between columns to automatically resize the column to fit its contents.

Expand Cell Contents

If the contents of a cell are truncated (indicated by an ellipsis [...]), click the cell to expand it vertically without changing the column width.

Rename Columns

To rename a column, double-click the column name and enter the new name.

Use the More Options Menu

The More options menu has several options for exploring and organizing your data. From top to bottom, you can:

  • Sort columns in ascending or descending order.
  • Hide columns.
  • Format numeric columns.
  • Train a column.
  • Change the column type.
  • Describe a column.
  • Drop a column.
  • Rename a column.

Interactive Display Panel

The display panel provides a number of ways to interact with your objects.

Minimize and Expand Objects

Objects in the display panel, such as charts and tables, can be minimized or expanded as you work in DataChat. To minimize an object, click More options > Minimize <object>. To expand a minimized object, click More options > Expand. To minimize or expand several objects at once, select the objects' checkboxes and then click Minimize or Expand in the sidebar.

View Objects in a Larger Window

To view an object from the display panel in a larger window, click View in a larger window.

Describe

tip

Use the Describe skill to confirm that each column's type is correct before continuing your analysis.

The Describe skill provides summary statistics and details about your dataset.

Describe a Dataset

There are several ways to show summary statistics about your data. The quickest is to click the Show Descriptive Statistics button in a table's header.

This expands each column to display quick statistics about the values of each column in your dataset.

Clicking Dataset > Describe from the sidebar opens a popup table that shows statistics about each column in the current dataset along with counts, unique counts, and column types. The table is named "<dataset name>_Describe". The columns are listed in the same order as the columns in the dataset. Once you close the popup to continue working with your data, the output table appears in the chat history.

You can also enter either of the following in the GEL input field:

  • Describe, which operates on the current dataset.
  • Describe the dataset <dataset name>, which operates on the dataset you specify.
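To make the contents of the Describe table more concrete, here is a minimal Python sketch of a per-column summary (count, unique count, and column type), listed in the same order as the dataset's columns. It is not DataChat's implementation. The describe function, the list-of-dicts table layout, and the handling of missing values as None are all assumptions made for illustration.

```python
from collections import Counter


def describe(rows):
    """Summarize each column of a table given as a list of dicts.

    Returns one summary dict per column, in the table's column order.
    Missing values (None) are excluded from every statistic.
    """
    if not rows:
        return []
    summary = []
    for column in rows[0]:  # dicts preserve insertion order
        values = [row[column] for row in rows if row[column] is not None]
        # Report the most common Python type as the column type.
        type_names = Counter(type(v).__name__ for v in values)
        summary.append({
            "column": column,
            "count": len(values),
            "unique": len(set(values)),
            "type": type_names.most_common(1)[0][0] if values else "unknown",
        })
    return summary


rows = [
    {"city": "Madison", "temp": 71.5, "rainy": False},
    {"city": "Chicago", "temp": 68.0, "rainy": True},
    {"city": "Madison", "temp": None, "rainy": False},
]
for line in describe(rows):
    print(line)
```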

Describe a Column

There are several ways to view further details about a column.

  • From the table generated by Describe, click on the link for each column to generate a popup that shows the distribution chart of the column, along with further details. Once you close the popup to continue working with your data, the distribution chart appears in the chat history.

  • Click Column > Describe in the sidebar and choose a column from the current dataset.

  • In the dataset panel, open the column's More options menu and then click Describe.

  • In the GEL input field, enter: Describe the column <column name>.

If a column has few unique values (low cardinality), such as a Boolean column, a donut chart containing the count of records for each unique value is returned along with a table containing detailed statistics.
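The low-cardinality case can be sketched as a count of records per unique value, which is the data a donut chart would plot. The threshold below is hypothetical. The source does not say how many unique values count as "few", and the function name is made up for this example.

```python
from collections import Counter

LOW_CARDINALITY_LIMIT = 10  # hypothetical threshold, not a DataChat setting


def value_counts_if_low_cardinality(values, limit=LOW_CARDINALITY_LIMIT):
    """Return {value: record count} when the column has few unique values.

    Returns None for high-cardinality columns, where a per-value
    breakdown would not be useful.
    """
    counts = Counter(values)
    if len(counts) <= limit:
        return dict(counts)
    return None


print(value_counts_if_low_cardinality([True, False, True, True]))
```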

Describe a Dataset in Detail

You can also describe a dataset in detail to see more in-depth summary statistics.

Enter either of the following in the GEL input field:

  • Describe the current dataset in detail
  • Describe the dataset <dataset name> in detail

Preview

Display a portion of a given dataset with Preview. The output appears as a new dataset, <dataset name>_Preview, and doesn't change the current dataset. You can preview the entire dataset, a portion of the dataset (a random sample or a percentage), or rows that meet a given condition. By default, the rows shown keep the dataset's original order.

To preview your data, enter in the GEL input field: Preview the dataset <dataset name>. A popup appears that shows the preview. When you close the popup, the table appears in the chat history. To apply more options, see Preview.

Sample

Display a portion of a given dataset with Sample. The output appears as a new dataset, <dataset name>_Sample, which becomes the current dataset. You can sample the entire dataset, a portion of the dataset (a random sample or a percentage), or rows that meet a given condition. By default, the rows shown keep the dataset's original order.

To sample the current dataset, click Dataset > Sample in the sidebar. The Sample form opens:

  1. Select either a number of rows or a percentage of the dataset.

  2. Enter the number of rows or the percentage of the dataset to sample.

  3. Select either random or sequential sampling. For sequential sampling, the sample starts with the first row and moves down the dataset until the row or percentage limit is met.

  4. Optionally, click Conditions to specify which rows to sample.

  5. Optionally, to add another condition after the first, click Add Another Option.

  6. Click Submit.

You can also enter in the GEL input field: Sample the dataset <dataset name>. To apply more options, see Sample.
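The choices in the Sample form (rows or percentage, random or sequential, optional conditions) can be sketched in Python as follows. This is an illustration under stated assumptions, not DataChat's implementation. The sample_rows function and its parameters are hypothetical, conditions are assumed to be applied before sampling, and a percentage is assumed to be taken of the rows that pass the conditions and rounded up.

```python
import math
import random


def sample_rows(rows, *, count=None, percent=None, method="sequential",
                condition=None, seed=None):
    """Return a sample of rows, chosen by row count or by percentage.

    method="sequential" starts at the first row and moves down;
    method="random" draws rows without replacement.
    """
    if (count is None) == (percent is None):
        raise ValueError("give exactly one of count or percent")
    if method not in ("sequential", "random"):
        raise ValueError("method must be 'sequential' or 'random'")

    # Assumption: conditions filter the rows before sampling.
    candidates = [r for r in rows if condition is None or condition(r)]
    if percent is not None:
        # Assumption: a percentage rounds up to a whole number of rows.
        count = math.ceil(len(candidates) * percent / 100)
    count = min(count, len(candidates))

    if method == "sequential":
        return candidates[:count]
    return random.Random(seed).sample(candidates, count)


rows = [{"id": i} for i in range(10)]
print(sample_rows(rows, count=3))
print(sample_rows(rows, percent=20, method="random", seed=1))
```

Passing a seed makes the random draw repeatable, which is convenient when comparing samples across runs.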

Search

To search your data, click Dataset > Search in the sidebar:

  1. Select whether to search all datasets or a single dataset.

  2. If you choose to search a single dataset, select the dataset to search.

  3. Choose whether to search for a specific string or a pattern.

  4. Enter the string or pattern.

  5. Click Submit.

The Search skill has many variations. You can search the columns in a specific dataset or across all datasets to find rows with data that fit certain criteria.
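The idea behind searching for a string or a pattern, in one dataset or across all of them, can be sketched in Python. This is not the Search skill itself. The search function, the dict-of-tables layout, and case-sensitive matching are assumptions made for illustration.

```python
import re


def search(datasets, text, *, is_pattern=False, dataset=None):
    """Find rows where any value matches a string or a regex pattern.

    datasets maps a dataset name to a list of row dicts. Pass dataset
    to search a single dataset; otherwise every dataset is searched.
    Returns a list of (dataset name, row) pairs.
    """
    # A plain string is escaped so it is matched literally.
    regex = re.compile(text if is_pattern else re.escape(text))
    names = [dataset] if dataset is not None else list(datasets)
    hits = []
    for name in names:
        for row in datasets[name]:
            if any(regex.search(str(value)) for value in row.values()):
                hits.append((name, row))
    return hits


datasets = {
    "orders": [{"id": 1, "item": "blue pen"}, {"id": 2, "item": "red ink"}],
    "staff": [{"name": "Penny"}],
}
print(search(datasets, "pen"))
print(search(datasets, r"^[Pp]en", is_pattern=True))
```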