Warranty Claims Fraud Prediction
Project Overview
This project analyzes warranty claims by region, product, claim value, and other features to predict whether each claim is genuine or fraudulent. The dataset is taken from Kaggle and contains 358 rows and 21 columns.
Data Dictionary
Column Name | Description |
---|---|
Unnamed: 0 | Index |
Region | Region of the claim |
State | State of the claim |
Area | Area of the claim |
City | City of the claim |
Consumer_profile | Consumer profile: Business or Personal |
Product_category | Product category: Household or Entertainment |
Product_type | Product type: AC or TV |
AC_1001_Issue | 0 = no issue / no component, 1 = repair, 2 = replacement |
AC_1002_Issue | 0 = no issue / no component, 1 = repair, 2 = replacement |
AC_1003_Issue | 0 = no issue / no component, 1 = repair, 2 = replacement |
TV_2001_Issue | 0 = no issue / no component, 1 = repair, 2 = replacement |
TV_2002_Issue | 0 = no issue / no component, 1 = repair, 2 = replacement |
TV_2003_Issue | 0 = no issue / no component, 1 = repair, 2 = replacement |
Claim_Value | Claim value in INR |
Service_Center | Service center code |
Product_Age | Product age in days |
Purchased_from | Purchase channel: Dealer, Manufacturer, or Internet |
Call_details | Call duration |
Purpose | Purpose of the call |
Fraud | Fraudulent (1) or Genuine (0) |
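
As a minimal sketch of how the table above might be applied when loading the data, the snippet below reads a CSV with pandas, treats the `Unnamed: 0` column as the index, and checks that the issue columns and the `Fraud` label only contain the documented codes. The helper name `load_claims` and the two inline sample rows are hypothetical and only illustrate the layout. In practice the CSV downloaded from Kaggle would be passed instead.

```python
import io

import pandas as pd

# Issue columns as listed in the data dictionary.
ISSUE_COLUMNS = [
    "AC_1001_Issue", "AC_1002_Issue", "AC_1003_Issue",
    "TV_2001_Issue", "TV_2002_Issue", "TV_2003_Issue",
]


def load_claims(source):
    """Read a claims CSV (hypothetical helper) and validate the coded columns."""
    # The first column ("Unnamed: 0") is only a row index, so use it as such.
    df = pd.read_csv(source, index_col=0)

    # Issue columns should hold 0 (no issue), 1 (repair), or 2 (replacement).
    invalid_issue = ~df[ISSUE_COLUMNS].isin([0, 1, 2])
    if invalid_issue.any().any():
        raise ValueError("Issue columns contain codes other than 0, 1, 2")

    # The label should hold 1 (fraudulent) or 0 (genuine).
    if not df["Fraud"].isin([0, 1]).all():
        raise ValueError("Fraud column contains values other than 0 and 1")

    return df


# Made-up rows for illustration only; a subset of the 21 columns.
SAMPLE_CSV = "\n".join([
    ",Region,Product_type,AC_1001_Issue,AC_1002_Issue,AC_1003_Issue,"
    "TV_2001_Issue,TV_2002_Issue,TV_2003_Issue,Claim_Value,Fraud",
    "0,South,TV,0,0,0,1,2,0,15000.0,1",
    "1,North,AC,1,0,2,0,0,0,20000.0,0",
])

claims = load_claims(io.StringIO(SAMPLE_CSV))
print(claims.shape)
```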
Conclusion
The exploratory data analysis suggests that fraudulent claims tend to have higher claim values, and that certain regions and purchase channels are associated with a higher likelihood of fraud.
Notable findings from the analysis include:
- Warranty claims are most frequent in the southern region of India, particularly in Andhra Pradesh and Tamil Nadu.
- Fraudulent claims are more common in urban regions like Hyderabad and Chennai.
- Among products purchased for personal use, TVs had more warranty claims than ACs.
- Fraudulent claims for ACs were made even when there were no issues with AC parts, while in the case of TVs, fraudulent claims occurred both with and without issues in TV parts.
- Fraudulent claims were more frequent when purchases were made directly through the manufacturer.
- Fraudulent claims tend to have a higher claim value as compared to genuine claims.
- Service center 13 had the highest number of fraudulent claims despite handling relatively few warranty claims overall.
- Fraudulent claims were more frequent when the customer care call duration was less than 3-4 minutes.
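
Findings like those above typically come from grouping claims by a category and comparing counts, fraud rates, and claim values. The following is a rough sketch of that kind of comparison, not the notebook's actual code. The rows are made up for illustration and `fraud_summary` is a hypothetical helper. Only the column names come from the data dictionary.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the Kaggle dataset.
claims = pd.DataFrame({
    "Purchased_from": ["Manufacturer", "Manufacturer", "Dealer",
                       "Dealer", "Internet", "Internet"],
    "Claim_Value": [30000.0, 12000.0, 8000.0, 10000.0, 25000.0, 9000.0],
    "Fraud": [1, 0, 0, 0, 1, 0],
})


def fraud_summary(df, by):
    """Per category: total claims, fraudulent claims, and the fraud rate."""
    summary = df.groupby(by)["Fraud"].agg(
        claims="count",     # all claims in the category
        frauds="sum",       # Fraud is 0/1, so the sum counts fraudulent claims
        fraud_rate="mean",  # share of claims that are fraudulent
    )
    return summary.sort_values("fraud_rate", ascending=False)


# Fraud counts and rates per purchase channel.
by_channel = fraud_summary(claims, "Purchased_from")
print(by_channel)

# Claim value statistics for genuine (0) versus fraudulent (1) claims.
value_by_label = claims.groupby("Fraud")["Claim_Value"].agg(["mean", "median"])
print(value_by_label)
```

Reporting the count and the rate side by side matters for cases like the service center finding, where a category can have many fraudulent claims while handling few claims overall.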
The machine learning models (Decision Tree Classifier, Random Forest Classifier, and Logistic Regression) achieved high accuracy, but with only 358 rows there is room for improvement with more data.
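
For readers who want to try a similar comparison, here is a minimal scikit-learn sketch that fits the three model types and reports cross-validated accuracy. It rests on several assumptions not stated in this README: the data is randomly generated, the preprocessing (one-hot encoding plus scaling), the 5-fold cross-validation, and the hyperparameters are illustrative choices, and `make_model` is a hypothetical helper. The scores it prints say nothing about the project's real results.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; only the column names follow the data dictionary.
rng = np.random.default_rng(0)
n = 200
claims = pd.DataFrame({
    "Purchased_from": rng.choice(["Dealer", "Manufacturer", "Internet"], size=n),
    "Product_type": rng.choice(["AC", "TV"], size=n),
    "Claim_Value": rng.uniform(1000, 50000, size=n),
    "Product_Age": rng.integers(1, 1000, size=n),
})
# Synthetic label: higher claim values are more likely to be marked as fraud.
noise = rng.normal(0, 8000, size=n)
claims["Fraud"] = (claims["Claim_Value"] + noise > 35000).astype(int)

categorical = ["Purchased_from", "Product_type"]
numeric = ["Claim_Value", "Product_Age"]
X = claims[categorical + numeric]
y = claims["Fraud"]


def make_model(estimator):
    """Wrap an estimator with one-hot encoding and scaling (assumed preprocessing)."""
    preprocess = ColumnTransformer([
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
        ("num", StandardScaler(), numeric),
    ])
    return make_pipeline(preprocess, estimator)


models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Mean accuracy over 5 cross-validation folds for each model.
scores = {}
for name, estimator in models.items():
    fold_scores = cross_val_score(make_model(estimator), X, y, cv=5, scoring="accuracy")
    scores[name] = fold_scores.mean()
    print(f"{name}: {scores[name]:.3f}")
```

Cross-validation is used here instead of a single train/test split because, with a dataset of a few hundred rows, one split can give a noisy accuracy estimate.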