Investigate Customer Churn with Machine Learning
Being able to predict the future can give your business an edge and make decisions easier. In DataChat, you can quickly create machine learning models with the Train skill. In this example, we'll investigate some customer data to find out why customers churn and create a model that can help us predict whether a customer is likely to churn in the future.
Load Our Data
To start, Load the URL https://tinyurl.com/DataChatTraining/TelcoCustomers.csv.zip into the session. We're then given a sample of our dataset that looks something like this:
Describe Our Data
Next, let's use the Describe skill to get a better understanding of what we're working with. Click Dataset > Describe in the sidebar.
As we can see, there are 21 columns covering everything from the number of dependents a customer has to whether they use our company's online backup and security products.
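For readers curious about what a column summary roughly involves outside DataChat, here is a minimal Python sketch that uses only the standard library. The inline rows and the column names (`tenure`, `Contract`, `Churn`) are made-up stand-ins rather than the real dataset, and `describe_columns` is a hypothetical helper; it is not how the Describe skill is implemented.

```python
import csv
import io
from collections import Counter

# Hypothetical three-column sample; the real dataset has 21 columns.
SAMPLE_CSV = """tenure,Contract,Churn
1,Month-to-month,Yes
34,One year,No
2,Month-to-month,Yes
45,Two year,No
"""


def describe_columns(csv_text):
    """Summarize each column: value count, distinct count, and either
    min/max/mean (numeric columns) or the most common value (others)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows if row[column] != ""]
        info = {"count": len(values), "distinct": len(set(values))}
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            # Non-numeric column: report the most frequent value.
            info["top"] = Counter(values).most_common(1)[0][0]
        else:
            info.update(min=min(numbers), max=max(numbers),
                        mean=sum(numbers) / len(numbers))
        summary[column] = info
    return summary


summary = describe_columns(SAMPLE_CSV)
for name, info in summary.items():
    print(name, info)
```

Treating a column as numeric only when every value parses as a number keeps the sketch short; a real summary would also need to handle missing values and mixed types.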
Determine Why Customers Churn
The column that's most important to us is the Churn column. This column indicates whether the customer has churned, or left, our company. As a telecommunications company, we'd like to see what might drive customers to leave and predict whether a current customer will churn in the future. We can use the Train form to do both:
- Click ML > Train Model to open the Train ML Model form.
- Enter "Churn" for the column.
- Click Submit.
The first thing we see is an impact chart that looks like this:
We can see that the model has found that the type of contract the customer is using has the biggest impact on whether the customer will churn. The customer's tenure (how long they've been a customer) is also among the most impactful features.
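DataChat computes the impact scores for us, and this topic does not describe how. For rough intuition only, the sketch below ranks features by a much cruder proxy: the gap between the highest and lowest churn rate across a feature's categories. The rows, the column names, and the `churn_rate_gap` helper are all hypothetical.

```python
from collections import defaultdict

# Made-up rows for illustration; the column names are assumptions.
ROWS = [
    {"Contract": "Month-to-month", "Partner": "No", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Partner": "Yes", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Partner": "No", "Churn": "No"},
    {"Contract": "Two year", "Partner": "Yes", "Churn": "No"},
    {"Contract": "Two year", "Partner": "No", "Churn": "No"},
    {"Contract": "One year", "Partner": "Yes", "Churn": "No"},
]


def churn_rate_gap(rows, feature, target="Churn"):
    """Gap between the highest and lowest churn rate among the
    categories of `feature` (a crude stand-in for model impact)."""
    churned = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        total[row[feature]] += 1
        churned[row[feature]] += row[target] == "Yes"
    rates = [churned[category] / total[category] for category in total]
    return max(rates) - min(rates)


features = ["Contract", "Partner"]
# Features whose categories differ most in churn rate come first.
ranking = sorted(features, key=lambda f: churn_rate_gap(ROWS, f), reverse=True)
print(ranking)
```

A trained model weighs features jointly, so its ranking can differ from this one-feature-at-a-time view.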
We can then investigate these findings a bit more using the charts Train created for us. From the tabs in the chart header, select Visualize and click Chart1C to plot a bubble chart that compares each type of contract against internet service type. Using that chart, we can see that it's specifically month-to-month contracts that lead to the most churn.
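The comparison behind that chart can be approximated by hand: group customers by contract type and internet service type, then compute the share that churned in each group. The following is a minimal sketch with made-up rows; the column values shown are assumptions about the dataset, and `churn_rate_by_group` is a hypothetical helper.

```python
from collections import defaultdict

# Made-up (contract, internet service, churn) rows for illustration.
ROWS = [
    ("Month-to-month", "Fiber optic", "Yes"),
    ("Month-to-month", "Fiber optic", "Yes"),
    ("Month-to-month", "DSL", "No"),
    ("Month-to-month", "DSL", "Yes"),
    ("One year", "DSL", "No"),
    ("Two year", "Fiber optic", "No"),
]


def churn_rate_by_group(rows):
    """Map each (contract, internet service) pair to its churn rate."""
    counts = defaultdict(lambda: [0, 0])  # pair -> [churned, total]
    for contract, internet, churn in rows:
        cell = counts[(contract, internet)]
        cell[0] += churn == "Yes"
        cell[1] += 1
    return {pair: churned / total for pair, (churned, total) in counts.items()}


rates = churn_rate_by_group(ROWS)
for (contract, internet), rate in sorted(rates.items()):
    print(f"{contract:15} {internet:12} {rate:.0%}")
```

In a bubble chart, each of these pairs would be one bubble, with the churn rate or customer count driving its size.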
From here, we can save our model and use it on future datasets (that don't have the Churn column) to predict whether the customers in that dataset will churn. Refer to Predict with an Existing Model for more information.
Full Recipe
Here is the entire recipe we used in this topic:
Load data from the URL https://tinyurl.com/DataChatTraining/TelcoCustomers.csv.zip
Describe the dataset current
Train an ML model on Churn with the config {...} and generate charts for data visualization
Plot Chart Chart1C