Version: 0.32.2

Investigate Customer Churn with Machine Learning

Predicting future outcomes can give your business an edge and make decisions easier. In DataChat, you can quickly create machine learning models with the Train skill. In this example, we'll investigate some customer data to find out why customers churn, then create a model that can help us predict whether a customer is likely to churn in the future.

Load Our Data

To start, Load the URL https://tinyurl.com/DataChatTraining/TelcoCustomers.csv.zip into the session. We're then given a sample of our dataset that looks something like this:

Telco dataset
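For readers who want to inspect the same kind of file outside DataChat, the loading step can be approximated in plain Python. This is only a sketch, not how DataChat's Load skill works internally. The helper name `read_zipped_csv`, the tiny in-memory archive, and every column name other than Churn are assumptions made up for illustration.

```python
import csv
import io
import zipfile


def read_zipped_csv(zip_bytes: bytes) -> list[dict[str, str]]:
    """Return the rows of the first CSV file found inside a ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("archive contains no CSV file")
        with archive.open(csv_names[0]) as raw:
            # Decode the binary member as text so csv can parse it.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


# Build a tiny stand-in archive in memory (made-up rows, not the real data).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "sample.csv",
        "customerID,Contract,tenure,Churn\n"
        "A-001,Month-to-month,2,Yes\n"
        "A-002,Two year,48,No\n",
    )

rows = read_zipped_csv(buffer.getvalue())
print(len(rows), rows[0]["Contract"])
```

To try this on a real download, the bytes could come from `urllib.request.urlopen(url).read()` instead of the in-memory buffer.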

Describe Our Data

Next, let's get a better understanding of what we're working with. Click Dataset > Describe in the sidebar to use the Describe skill.

Describe table

As we can see, there are 21 columns covering everything from the number of dependents a customer has to whether they use our company's online backup and security products.
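The kind of per-column summary that Describe produces can be roughly imitated in a few lines of Python. The sketch below is an assumption about what a useful summary contains (non-empty count, distinct values, most common value); it does not reproduce DataChat's actual Describe output. The function name `describe_columns` and the sample rows are hypothetical.

```python
from collections import Counter


def describe_columns(rows):
    """Summarise each column: non-empty count, distinct values, most common value."""
    summary = {}
    if not rows:
        return summary
    for column in rows[0]:
        # Treat empty strings as missing values.
        values = [row[column] for row in rows if row[column] != ""]
        counts = Counter(values)
        most_common = counts.most_common(1)[0][0] if counts else None
        summary[column] = {
            "non_empty": len(values),
            "distinct": len(counts),
            "most_common": most_common,
        }
    return summary


# Made-up rows; column names other than Churn are illustrative assumptions.
sample = [
    {"Contract": "Month-to-month", "Dependents": "No", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Dependents": "", "Churn": "No"},
    {"Contract": "Two year", "Dependents": "Yes", "Churn": "No"},
]
summary = describe_columns(sample)
for column, stats in summary.items():
    print(column, stats)
```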

Determine Why Customers Churn

The column that's most important to us is the Churn column. This column indicates whether the customer has churned, or left, our company. As a telecommunications company, we'd like to see what might drive customers to leave and predict whether a current customer will churn in the future. We can use the Train form to do both:

  1. Click ML > Train Model to open the Train ML Model form.
  2. Enter "Churn" for the column.
  3. Click Submit.

Train form

The first thing we see is an impact chart that looks like this:

Telco Train bar chart

We can see that the model has found that the type of contract the customer is using has the biggest impact on whether the customer will churn. The next most impactful features include the customer's tenure (how long they've been a customer).
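To build intuition for what "impact" means here, one crude stand-in is to compare churn rates across the values of each feature and rank features by how far apart those rates are. This is not the method DataChat's Train skill uses to score impact, which the page does not describe; it is a simple heuristic under that stated assumption. The function names, sample rows, and column names other than Churn are hypothetical.

```python
from collections import defaultdict


def churn_rate_by_value(rows, feature, target="Churn", positive="Yes"):
    """Return the share of churned customers for each value of one feature."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        totals[row[feature]] += 1
        if row[target] == positive:
            churned[row[feature]] += 1
    return {value: churned[value] / totals[value] for value in totals}


def rank_features(rows, features):
    """Order features by the gap between their highest and lowest churn rate."""
    spread = {}
    for feature in features:
        rates = churn_rate_by_value(rows, feature).values()
        spread[feature] = max(rates) - min(rates)
    return sorted(spread, key=spread.get, reverse=True)


# Made-up rows for illustration only.
customers = [
    {"Contract": "Month-to-month", "Dependents": "Yes", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Dependents": "No", "Churn": "Yes"},
    {"Contract": "Two year", "Dependents": "Yes", "Churn": "No"},
    {"Contract": "Two year", "Dependents": "No", "Churn": "No"},
]

contract_rates = churn_rate_by_value(customers, "Contract")
ranking = rank_features(customers, ["Dependents", "Contract"])
print(contract_rates)
print(ranking)
```

In this toy data, churn rates differ sharply by contract type and not at all by dependents, so Contract ranks first. A trained model weighs features jointly, so its ranking can differ from this one-feature-at-a-time view.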

We can then investigate these findings a bit more using the charts Train created for us. From the tabs in the chart header, select Visualize and click Chart1C to plot a bubble chart that compares each type of contract against internet service type. Using that chart, we can see that it's specifically month-to-month contracts that lead to the most churn.

Telco bubble chart
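The comparison behind a chart like this can be sketched as a two-way grouping: for each combination of contract type and internet service type, count the customers and compute the share that churned. The code below is a hedged illustration, not the logic DataChat uses to build Chart1C. The function name `churn_by_pair`, the column names, and the sample values such as "Fiber optic" and "DSL" are assumptions.

```python
from collections import defaultdict


def churn_by_pair(rows, first, second, target="Churn", positive="Yes"):
    """Map each (first, second) value pair to (customer count, churn rate)."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        key = (row[first], row[second])
        totals[key] += 1
        # A boolean adds as 0 or 1, so this counts churned customers.
        churned[key] += row[target] == positive
    return {key: (totals[key], churned[key] / totals[key]) for key in totals}


# Made-up rows for illustration only.
customers = [
    {"Contract": "Month-to-month", "InternetService": "Fiber optic", "Churn": "Yes"},
    {"Contract": "Month-to-month", "InternetService": "Fiber optic", "Churn": "Yes"},
    {"Contract": "Month-to-month", "InternetService": "DSL", "Churn": "No"},
    {"Contract": "Two year", "InternetService": "DSL", "Churn": "No"},
]

grid = churn_by_pair(customers, "Contract", "InternetService")
for (contract, service), (count, rate) in grid.items():
    print(contract, service, count, rate)
```

Each pair corresponds to one bubble: the count could set its size and the churn rate its color.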

From here, we can save our model and use it on future datasets (that don't have the Churn column) to predict whether the customers in that dataset will churn. Refer to Predict with an Existing Model for more information.

Full Recipe

Here is the entire recipe we used in this topic:

Load data from the URL https://tinyurl.com/DataChatTraining/TelcoCustomers.csv.zip
Describe the dataset current
Train an ML model on Churn with the config {...} and generate charts for data visualization
Plot Chart Chart1C