Telecom Customer Churn Prediction
Project Overview
The aim of this project is to analyze customer demographics, services, tenure, and other variables to predict whether a particular customer will churn or not.
Data Dictionary
Variable | Description |
---|---|
CustomerID | Unique customer ID |
Gender | Customer's gender |
SeniorCitizen | Whether the customer is a senior citizen or not (1, 0) |
Partner | Whether the customer has a partner or not (Yes, No) |
Dependents | Whether the customer has dependents or not (Yes, No) |
Tenure | Number of months the customer has stayed with the company |
PhoneService | Whether the customer has a phone service or not (Yes, No) |
MultipleLines | Whether the customer has multiple lines or not (Yes, No, No phone service) |
InternetService | Customer’s internet service provider (DSL, Fiber optic, No) |
OnlineSecurity | Whether the customer has online security or not (Yes, No, No internet service) |
OnlineBackup | Whether the customer has online backup or not (Yes, No, No internet service) |
DeviceProtection | Whether the customer has device protection or not (Yes, No, No internet service) |
TechSupport | Whether the customer has tech support or not (Yes, No, No internet service) |
StreamingTV | Whether the customer has streaming TV or not (Yes, No, No internet service) |
StreamingMovies | Whether the customer has streaming movies or not (Yes, No, No internet service) |
Contract | The contract term of the customer (Month-to-month, One year, Two year) |
PaperlessBilling | Whether the customer has paperless billing or not (Yes, No) |
PaymentMethod | The customer’s payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic)) |
MonthlyCharges | The amount charged to the customer monthly |
TotalCharges | The total amount charged to the customer |
Churn | Whether the customer churned or not (Yes or No) |
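
Since most of the columns above are categorical strings, they need a numeric encoding before modeling. The following is a minimal sketch of one possible encoding, not necessarily the one used in the notebook. The helper name `encode_record` is hypothetical, and two choices in it are assumptions: "No phone service" and "No internet service" are treated the same as "No", and a blank charge value is mapped to `None`.

```python
# Hypothetical helper: converts one raw record (a dict of strings keyed by
# the column names in the data dictionary) into numeric values.

# Plain Yes/No columns.
YES_NO_COLUMNS = [
    "Partner", "Dependents", "PhoneService", "PaperlessBilling", "Churn",
]

# Columns with a third level meaning the underlying service is absent.
SERVICE_COLUMNS = [
    "MultipleLines", "OnlineSecurity", "OnlineBackup", "DeviceProtection",
    "TechSupport", "StreamingTV", "StreamingMovies",
]


def encode_record(record):
    """Return a dict of numeric features for the columns present in record."""
    encoded = {}
    for col in YES_NO_COLUMNS + SERVICE_COLUMNS:
        if col in record:
            # Assumption: "No phone service" / "No internet service" count as "No".
            encoded[col] = 1 if record[col] == "Yes" else 0
    for col in ("SeniorCitizen", "Tenure"):
        if col in record:
            encoded[col] = int(record[col])
    for col in ("MonthlyCharges", "TotalCharges"):
        if col in record:
            text = str(record[col]).strip()
            # Assumption: a blank charge is kept as a missing value (None).
            encoded[col] = float(text) if text else None
    return encoded
```

Multi-level columns such as Contract, InternetService, and PaymentMethod are left out of this sketch; they would typically need one-hot or ordinal encoding.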
Conclusion
The exploratory data analysis shows that senior citizens have a lower churn count, whereas customers who are single or have no dependents have a higher churn count. In addition, customers with streaming services have a lower churn count than customers with other services such as online backup and device protection, which suggests that customers are more satisfied with the streaming services.
Tenure is inversely related to churn count: customers with a tenure shorter than 5 months have a higher churn count. Moreover, customers on a month-to-month contract have a higher churn count than those on one-year or two-year contracts, which indicates that customers with longer contracts churn less.
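The group comparisons above can be sketched as a small aggregation. This is an illustrative sketch only: the function name `churn_by_group` is hypothetical, and the toy rows are invented for the example rather than taken from the dataset. It reports the churn rate alongside the churn count, since groups of different sizes are easier to compare by rate.

```python
from collections import defaultdict


def churn_by_group(records, key):
    """Return {group: (churned, total, rate)} for the column named by key."""
    counts = defaultdict(lambda: [0, 0])  # group -> [churned, total]
    for rec in records:
        group = rec[key]
        counts[group][1] += 1
        if rec["Churn"] == "Yes":
            counts[group][0] += 1
    return {g: (c, n, c / n) for g, (c, n) in counts.items()}


# Toy rows invented for illustration; not taken from the dataset.
toy = [
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Churn": "No"},
    {"Contract": "Two year", "Churn": "No"},
    {"Contract": "Two year", "Churn": "No"},
]

summary = churn_by_group(toy, "Contract")
for group, (churned, total, rate) in summary.items():
    print(f"{group}: {churned}/{total} churned ({rate:.0%})")
```

The same function works for any categorical column in the data dictionary, or for a tenure bucket computed beforehand.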
Customers with higher monthly charges and lower total charges have a higher churn count, so the company should consider lowering monthly charges in order to reduce churn. The feature importances show that tenure, contract, monthly charges, and total charges are the most important features for predicting customer churn, so the company should focus on these features to reduce customer churn.
For the machine learning models, I used three classifiers: Decision Tree Classifier, Random Forest Classifier, and K Nearest Neighbors Classifier. The Random Forest Classifier has the highest accuracy (82%) and F1 score, and the lowest mean squared error and mean absolute error. Therefore, the Random Forest Classifier is a good fit for predicting customer churn.
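To make the reported metrics concrete, here is a minimal sketch of how accuracy, F1 score, mean squared error, and mean absolute error can be computed for binary churn labels. The function name `binary_metrics` is hypothetical, and the labels are invented toy values, not the project's predictions. One point worth noting: when labels and predictions are both 0 or 1, the mean squared error and mean absolute error are identical and both equal 1 minus accuracy, so they rank the models the same way accuracy does.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, F1, MSE, and MAE for 0/1 labels and predictions."""
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = sum(1 for t, p in pairs if t == p) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    mse = sum((t - p) ** 2 for t, p in pairs) / n
    mae = sum(abs(t - p) for t, p in pairs) / n
    return {"accuracy": accuracy, "f1": f1, "mse": mse, "mae": mae}


# Toy labels invented for illustration (1 = churned, 0 = stayed).
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Because churners are usually the minority class, the F1 score is the more informative of these numbers when comparing the three classifiers.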