Version: 0.32.2

Guided Learning

When we use DataChat, analysis is augmented with visualization and vice versa. Visualizations provide an easier way to understand your data and simplify how you share your insights with others.

We'll step through how to build and annotate a chart from datasets.

Open the BikeShare Dataset

Download the dataset "BikeShare dataset (Excel)" from DataChat Training, then open a new session from the homepage and load the file into it.

bike share data

BikeShare.xlsx contains the datasets "BikeShare", "SeasonDecode", and "WeatherDecode". When an Excel file loads into a new session, each sheet loads as a separate dataset. The "All Datasets" table displays all the datasets loaded into the session. Once we close the "All Datasets" table, the "BikeShare" dataset displays in the dataset panel of Grid mode.

The BikeShare dataset captures the activity of bike riders during the period Jan. 1, 2011 through Feb. 26, 2012. It also captures weather and seasonal data.

Explore Your Data with Chart Builder

The Chart Builder provides an interactive interface for refining our analysis through visualization. To open the Chart Builder, click Plot in the sidebar.

chart builder open

Now, let's ask a few questions about our data.

When do people ride their bikes?

In the Chart Builder, select "hour" for the X-Axis and "allRiders" for the Y-Axis.

By default, the Chart Builder starts with the scatter chart option. You can also hover over the options to see what other plot types are available.

Bike Share Plot with Scatter button

In the chart display panel, hover over the lines for each hour to see the individual data points that make up each line.

BikeShare Plot with Scatter

We can see the lowest number of riders at 4 am, ramping up to a local maximum at 8 am and an overall maximum at 5 pm, which gives two peaks per weekday. This corresponds to what we'd guess for rush-hour travel.
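
The same reading of the chart (a quiet hour, then two peaks) can be expressed as a small calculation. The Python sketch below is only an illustration outside DataChat: the hourly values are made up to mimic the shape described above, and `local_peaks` is a hypothetical helper, not a DataChat function.

```python
# Hypothetical average rider counts for hours 0-23, for illustration only.
# These values are NOT taken from BikeShare.xlsx.
riders_by_hour = [12, 8, 5, 3, 2, 6, 25, 70, 120, 75, 60, 58,
                  60, 65, 70, 85, 110, 150, 130, 95, 70, 50, 35, 20]


def local_peaks(values):
    """Return the indices whose value is strictly greater than both neighbours."""
    return [
        i for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    ]


# Hour with the fewest riders.
quietest_hour = riders_by_hour.index(min(riders_by_hour))
# Hours that stand out as peaks, and the tallest of those peaks.
peaks = local_peaks(riders_by_hour)
busiest_hour = max(peaks, key=lambda hour: riders_by_hour[hour])

print("quietest hour:", quietest_hour)
print("peak hours:", peaks)
print("busiest hour:", busiest_hour)
```

With real data, the list would hold one aggregated value per hour, such as the average of "allRiders" for that hour.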

Let's change the chart type to "Violin".

BikeShare Plot with Violin

The violin chart gives a more nuanced perspective of the distribution of riders. But for now, let's return to the scatter chart.

Does it look the same on holidays?

Under Optional Fields, select "holiday" from Subplot.

BikeShare Plot with Scatter and Subplot

We can see that on holidays, riders start riding at about the same time, but stay out later.

Setting a subplot separates the main plot into subplots based on the values of the column chosen in Subplot. In this case, holiday is an Integer column. We can click Describe at the bottom of the chart display to verify.

BikeShare plot with describe

Looking at the chart titles, we can see that the leftmost chart includes points where holiday is 0, while for the rightmost chart, holiday is 1.
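
Conceptually, a subplot behaves like a group-by on the chosen column: each distinct value gets its own panel. The Python sketch below illustrates that idea outside DataChat. It is not DataChat's implementation; the rows are made up, and `split_by` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical rows for illustration only; not taken from BikeShare.xlsx.
rows = [
    {"hour": 8, "allRiders": 120, "holiday": 0},
    {"hour": 17, "allRiders": 150, "holiday": 0},
    {"hour": 8, "allRiders": 40, "holiday": 1},
    {"hour": 14, "allRiders": 90, "holiday": 1},
]


def split_by(rows, column):
    """Group rows by the distinct values of `column`, one group per subplot."""
    panels = defaultdict(list)
    for row in rows:
        panels[row[column]].append(row)
    return dict(panels)


panels = split_by(rows, "holiday")
for value in sorted(panels):
    # Each panel keeps the same X and Y columns as the main plot.
    points = [(r["hour"], r["allRiders"]) for r in panels[value]]
    print(f"holiday = {value}: {points}")
```

Because holiday takes only the values 0 and 1, the split produces exactly two panels, matching the two charts described above.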

Do registered riders and casual riders have different hours?

Let's change Y-Axis to "registeredRiders".

We can see that the peaks we saw with "allRiders" remain during rush hour.

BikeShare Plot with Scatter and Subplot, registered Riders

We can also change Y-Axis to "casualRiders".

The ride profile changes from rush-hour peaks to a bell curve that rises between 10 am and 8 pm, with a maximum around 2 pm.

BikeShare Plot with Scatter and Subplot, casual Riders

Note that the number of casualRiders is far lower than the number of allRiders. If we used Dataset > Describe during exploration, we would see that there are 322 unique casualRiders and 776 unique registeredRiders, with a total of 1098 unique allRiders.

What does the weather look like when people ride?

Let's create a couple more charts to investigate how weather conditions impact riders. Select Single Metric Chart and enter "relativeHumidity" for the column.

Bikeshare humidity chart

This chart displays a single value, in this case the average relative humidity of 63%. By default, the selected column typically uses average as its aggregate. To change the aggregate, click the column name and select a new one.

Bikeshare humidity minimum
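
A single metric chart reduces a whole column to one number, and the aggregate decides how. The Python sketch below shows that idea outside DataChat. The readings are made up, `single_metric` is a hypothetical helper, and the aggregate names are illustrative rather than DataChat's actual menu options.

```python
from statistics import mean, median

# Hypothetical humidity readings (percent), for illustration only.
relative_humidity = [40, 55, 70, 85]

# Illustrative aggregate names mapped to the functions that compute them.
aggregates = {
    "average": mean,
    "minimum": min,
    "maximum": max,
    "median": median,
}


def single_metric(values, aggregate="average"):
    """Reduce a column of values to the one number a single metric chart shows."""
    return aggregates[aggregate](values)


print("average:", single_metric(relative_humidity))
print("minimum:", single_metric(relative_humidity, "minimum"))
```

Switching the aggregate changes only the reduction step; the underlying column stays the same.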

Let's try a different chart type. Click Line Chart and enter "temperatureRelativeto41C" for the X-Axis and "allRiders", "casualRiders", and "registeredRiders" for the Y-Axis.

Bikeshare temperature

Here, we can see that as temperature increases, so does ridership across all types of riders. We can also see that both registered riders and all riders decrease significantly once the temperature gets too warm.

Let's create one last chart to explore weather situations. Select Stacked Bar Chart, then enter "weatherSituation" for the X-Axis, average "allRiders" for the Y-Axis, and "seasonCode" for the Partition.

Bikeshare weather

This chart shows the average number of riders in each season for each weather situation. We can see that seasons 2 and 3, summer and fall, have the highest ridership for weather situations 1 and 2, clear and cloudy weather, with almost no ridership for weather situation 4, heavy rain.
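
Each segment of the stacked bar corresponds to one combination of X-Axis value and Partition value, with the Y-Axis column averaged within it. The Python sketch below illustrates that grouping outside DataChat. The rows are made up, and `average_by` is a hypothetical helper, not a DataChat function.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows for illustration only; not taken from BikeShare.xlsx.
rows = [
    {"weatherSituation": 1, "seasonCode": 2, "allRiders": 200},
    {"weatherSituation": 1, "seasonCode": 2, "allRiders": 100},
    {"weatherSituation": 1, "seasonCode": 3, "allRiders": 180},
    {"weatherSituation": 2, "seasonCode": 3, "allRiders": 90},
]


def average_by(rows, x, partition, y):
    """Average `y` for every (x, partition) pair: one stacked segment each."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[x], row[partition])].append(row[y])
    return {key: mean(values) for key, values in groups.items()}


segments = average_by(rows, "weatherSituation", "seasonCode", "allRiders")
for (weather, season), avg in sorted(segments.items()):
    print(f"weatherSituation {weather}, seasonCode {season}: {avg}")
```

Segments that share a "weatherSituation" value stack into the same bar, and the "seasonCode" value determines which segment within that bar each average belongs to.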

So far, we have been building the chart within the Chart Builder. When it's in a form we want to save, click Submit.