Describe
Describe helps you understand your data by showing some summary statistics about a given column or dataset. These statistics include:
- Count. The number of rows with valid values in the column.
- Unique. The number of unique values in the column.
- Mean. The average value (if applicable).
- Min. The smallest value in the column.
- Max. The largest value in the column.
- Representation. The type of the values in the column, such as integers, floats, or strings.
- Category. The column’s category, including:
- Display. How the values in the column are displayed, such as currency or percentages.
Note that Describe does not change your dataset, unlike some other skills.
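As a rough illustration of what the basic statistics mean, the sketch below computes Count, Unique, Mean, Min, and Max for a single column in plain Python. The function name `describe_column` and the use of `None` for missing values are assumptions made for this example; they are not part of the Describe skill itself, whose internals are not documented here.

```python
from statistics import mean

def describe_column(values):
    """Summarize one column (hypothetical helper); None marks a missing value."""
    # Count only considers rows with valid (non-missing) values.
    valid = [v for v in values if v is not None]
    # Mean is only applicable when every valid value is a number.
    numeric = bool(valid) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in valid
    )
    return {
        "Count": len(valid),
        "Unique": len(set(valid)),
        "Mean": mean(valid) if numeric else None,
        "Min": min(valid) if valid else None,
        "Max": max(valid) if valid else None,
    }

prices = [10, 20, 20, None, 50]
summary = describe_column(prices)
print(summary)
```

Like Describe, the helper only reads the column: `prices` is unchanged after the call.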
Format
Describe has several formats you can use to show metadata of a column or a whole dataset:
- Describe the column <column name> shows the summary statistics of the given column along with a distribution chart.
- Describe the dataset <dataset name> shows the summary statistics of all columns in the given dataset.
- Describe the current dataset in detail shows more detailed summary statistics of all columns in the current dataset.
Parameters
The parameters used to describe a dataset include:
- Dataset name or column name (required). The name of the dataset or column to describe.
- In detail (optional). Provides a more detailed description of a dataset, including the following summary statistics:
  - Std. The standard deviation of the values in the column.
  - 25%. The 25th percentile. 25 percent of the values in the column are below this value.
  - Median. The 50th percentile. 50 percent of the values in the column are below this value and 50 percent of the values are above this value.
  - 75%. The 75th percentile. 75 percent of the values in the column are below this value.
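The additional "in detail" statistics can be sketched with Python's standard `statistics` module. This is only an illustration: the helper name `describe_in_detail` is hypothetical, and the choices of sample standard deviation and the "inclusive" quantile method are assumptions, since the text does not say which variants Describe uses.

```python
from statistics import quantiles, stdev

def describe_in_detail(values):
    """Compute the extra statistics for a numeric column (hypothetical helper)."""
    # Ignore missing values, as in the basic statistics.
    valid = [v for v in values if v is not None]
    # n=4 splits the data into quartiles and returns the three cut points.
    q1, q2, q3 = quantiles(valid, n=4, method="inclusive")
    return {
        "Std": stdev(valid),  # sample standard deviation (assumption)
        "25%": q1,
        "Median": q2,
        "75%": q3,
    }

detail = describe_in_detail([1, 2, 3, 4, 5])
print(detail)
```

For the values 1 through 5, the quartile cut points fall on 2, 3, and 4, so the median matches the middle value of the sorted column.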
Output
If a column is successfully described, a two-column table with the column’s statistics appears in the conversation history. In addition, a histogram is plotted if the column is continuous, or a bar chart of counts is plotted if the column is categorical.
If a dataset is successfully described or described in detail, the statistics for each column in the dataset are shown in a table. Note that clicking any column name in the Column column runs Describe on that column.
In datasets with more than 100,000 rows, string-type columns display "N/A" for Unique instead of a tally of the unique values.
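The two output rules above (which chart is drawn, and when Unique falls back to "N/A") can be sketched as follows. The function names, the `UNIQUE_ROW_LIMIT` constant, and the simplification that numeric columns count as continuous while everything else counts as categorical are all assumptions for illustration; the text does not describe how Describe makes these decisions internally.

```python
UNIQUE_ROW_LIMIT = 100_000  # row threshold stated in the text

def unique_cell(values, dataset_rows):
    """Return the Unique cell for a column (hypothetical helper)."""
    valid = [v for v in values if v is not None]
    is_string = bool(valid) and all(isinstance(v, str) for v in valid)
    # String columns in large datasets show "N/A" instead of a tally.
    if is_string and dataset_rows > UNIQUE_ROW_LIMIT:
        return "N/A"
    return len(set(valid))

def chart_kind(values):
    """Pick the chart type, treating numeric columns as continuous (assumption)."""
    valid = [v for v in values if v is not None]
    numeric = bool(valid) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in valid
    )
    return "histogram" if numeric else "bar chart"

print(unique_cell(["a", "b", "a"], dataset_rows=3))
print(unique_cell(["a", "b"], dataset_rows=100_001))
print(chart_kind([1.5, 2.0, 3.25]))
print(chart_kind(["red", "blue"]))
```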
Examples
To describe a column, enter Describe the column <column name>
To describe the current dataset, enter Describe or Describe the current dataset
To describe a named dataset, enter Describe the dataset <dataset name>
To describe the current dataset in detail, enter Describe the current dataset in detail
To describe a named dataset in detail, enter Describe the dataset <dataset name> in detail