Skip to main content

Summarize Large Datasets with Pivot

Summarizing large datasets can help you to better understand, identify, and compare specific aspects of your data. In this example, we'll demonstrate how to create pivot tables to summarize large datasets.

Load Our Data

To start, Load the "BikeShare.xlsx Dataset" into your session. We're given a sample of our data that looks something like this:

Bikeshare Data

Extract Our Data

Let's prepare our data by using the Extract skill to find the week of the year for each row. Later, we'll use this information to dig into the data further.

  1. Click Add Column > Extract in the skill menu.
  2. Enter "date" for the column, "week" for the date parts to extract, and "Week" for the name of the extracted column.
  3. Click Submit.

Extract form

This gives us the following table. As we can see, the dates of recorded registered riders are now listed with their corresponding week.

extracted weeks table

Create a Pivot Table

Currently, our dataset has over 17,000 rows. Let's create a pivot table to make this dataset more manageable. Let's say that we'd like to show the total number of registered riders for each week and weather situation.

  1. Click Aggregate > Pivot in the skill menu.
  2. Select "total" for the aggregate.
  3. Select "registeredRiders" for the aggregate column.
  4. Select "Week" for the pivot rows.
  5. Select "weatherSituation" for the pivot columns.
  6. Click Submit

Pivot form

This creates a new pivot table in the Chart tab, "totalRegisteredRidersPivot", that summarizes the total number of registered riders for each week in each weather situation.

Pivot table