Describe helps you understand your data by showing some summary statistics about a given column or dataset. These statistics include:
- Count. The number of rows with valid values in the column.
- Unique. The number of unique values in the column.
- Mean. The average value (if applicable).
- Min. The smallest value in the column.
- Max. The largest value in the column.
- Representation. The type of the values in the column, such as integers, floats, or strings.
- Category. The column’s category, including:
- Display. How the values in the column are displayed, such as currency or percentages.
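As a rough illustration of what the basic statistics above measure, the following Python sketch computes Count, Unique, Mean, Min, and Max for a single column. It is not how Describe is implemented: the function name `summarize` is hypothetical, and treating `None` as a missing (invalid) value is an assumption made for the example.

```python
from statistics import mean


def summarize(values):
    """Return basic summary statistics for one column (hypothetical helper)."""
    # Assumption: None marks a missing value and is excluded from every statistic.
    valid = [v for v in values if v is not None]
    # Mean only applies when every valid value is a number.
    numeric = bool(valid) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in valid
    )
    return {
        "Count": len(valid),
        "Unique": len(set(valid)),
        "Mean": mean(valid) if numeric else None,
        "Min": min(valid) if valid else None,
        "Max": max(valid) if valid else None,
    }


prices = [10, 20, 20, None, 50]
stats = summarize(prices)
print(stats)
```

For a string column such as `["a", "b", "a"]`, the same helper reports a Mean of `None`, which mirrors the "if applicable" note on Mean.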
Unlike some other skills, Describe does not change your dataset.
Describe has several formats you can use to show metadata of a column or a whole dataset:
Describe the column <column name>. Shows the summary statistics of the given column along with a distribution chart.
Describe the dataset <dataset name>. Shows the summary statistics of all columns in the given dataset.
Describe the current dataset in detail. Shows more detailed summary statistics of all columns in the current dataset.
The parameters used to describe a dataset include:
column name (required). The name of the dataset or column to describe.
In detail (optional). Provides a more detailed description of a dataset, including the following summary statistics:
- Std. The standard deviation of the values in the column.
- 25%. The 25th percentile. 25 percent of the values in the column are below this value.
- Median. The 50th percentile. 50 percent of the values in the column are below this value and 50 percent of the values are above this value.
- 75%. The 75th percentile. 75 percent of the values in the column are below this value.
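The detailed statistics can be sketched in Python as follows. This is only an illustration of the definitions above. The helper name `detail_stats` is hypothetical, and the use of the sample standard deviation and of inclusive percentile interpolation are assumptions; Describe may use different conventions, so its values can differ slightly on small columns.

```python
from statistics import median, quantiles, stdev


def detail_stats(values):
    """Return Std, 25%, Median, and 75% for a numeric column (hypothetical helper)."""
    # quantiles(..., n=4) returns the three cut points that split the data into quarters.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "Std": stdev(values),  # assumption: sample standard deviation
        "25%": q1,
        "Median": median(values),
        "75%": q3,
    }


detail = detail_stats([1, 2, 3, 4, 5])
print(detail)
```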
If a column is successfully described, a two-column table with the column’s statistics appears in the log. In addition, a histogram is plotted if the column is continuous, or a bar chart of counts is plotted if the column is categorical.
If a dataset is successfully described or described in detail, the statistics for each column in the dataset are shown in a table. Clicking any name in the Column column of that table runs Describe on that column.
In datasets larger than 100,000 rows, string-type columns display "N/A" instead of tallying the number of Unique values.
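The rule above can be expressed as a small Python sketch. The function name `unique_display` and its parameters are hypothetical and chosen only for illustration; the 100,000-row threshold is the one stated above.

```python
ROW_LIMIT = 100_000  # threshold stated in the text above


def unique_display(values, row_count, is_string):
    """Return what the Unique entry would show for a column (hypothetical helper)."""
    # String-type columns in datasets larger than the limit are not tallied.
    if is_string and row_count > ROW_LIMIT:
        return "N/A"
    return len(set(values))


print(unique_display(["a", "b", "a"], row_count=3, is_string=True))
print(unique_display(["a", "b", "a"], row_count=250_000, is_string=True))
```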
To describe a column, enter
Describe the column <column name>
To describe the current dataset, enter
Describe the current dataset
To describe a named dataset, enter
Describe the dataset <dataset name>
To describe the current dataset in detail, enter
Describe the current dataset in detail
To describe a named dataset in detail, enter
Describe the dataset <dataset name> in detail