Heart Stroke Prediction

Project Overview

This data science project aims to predict the likelihood that a patient will experience a stroke from input parameters such as gender, age, hypertension, heart disease, and smoking status. The dataset provides one record per patient, which enables the development of a predictive model.

About the Dataset

The dataset used in this project contains information necessary to predict the occurrence of a stroke. Each row in the dataset represents a patient, and the dataset includes the following attributes:

id: Unique identifier

gender: "Male", "Female", or "Other"

age: Age of the patient

hypertension: 0 if the patient doesn't have hypertension, 1 if the patient has hypertension

heart_disease: 0 if the patient doesn't have any heart diseases, 1 if the patient has a heart disease

ever_married: "No" or "Yes"

work_type: "Children", "Govt_job", "Never_worked", "Private", or "Self-employed"

Residence_type: "Rural" or "Urban"

avg_glucose_level: Average glucose level in the blood

bmi: Body mass index

smoking_status: "Formerly smoked", "Never smoked", "Smokes", or "Unknown"

stroke: 1 if the patient had a stroke, 0 if not
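As a rough illustration of how the attributes above map to typed values, the sketch below parses a small CSV sample using only the Python standard library. The two sample rows are invented for the example, and the names `parse_row` and `load_patients` are hypothetical helpers, not part of the project. Treating an empty cell as a missing value is also an assumption.

```python
import csv
import io

# Column types follow the attribute list above; the sample rows are made up.
NUMERIC = {"age": float, "avg_glucose_level": float, "bmi": float}
BINARY = ("hypertension", "heart_disease", "stroke")

SAMPLE_CSV = """id,gender,age,hypertension,heart_disease,ever_married,work_type,Residence_type,avg_glucose_level,bmi,smoking_status,stroke
1,Male,67,0,1,Yes,Private,Urban,228.69,36.6,Formerly smoked,1
2,Female,45,0,0,No,Govt_job,Rural,95.12,,Never smoked,0
"""


def parse_row(row):
    """Convert one CSV row (a dict of strings) into typed values."""
    parsed = dict(row)
    parsed["id"] = int(row["id"])
    for name, cast in NUMERIC.items():
        # Assumption: an empty cell means the value is missing.
        parsed[name] = cast(row[name]) if row[name] != "" else None
    for name in BINARY:
        value = int(row[name])
        if value not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1, got {value}")
        parsed[name] = value
    return parsed


def load_patients(text):
    """Parse CSV text into a list of typed patient records."""
    return [parse_row(row) for row in csv.DictReader(io.StringIO(text))]


patients = load_patients(SAMPLE_CSV)
```

The categorical columns (gender, ever_married, work_type, Residence_type, smoking_status) are kept as strings here; they would need encoding before being passed to most models.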

Impact

According to the World Health Organization (WHO), stroke is the second leading cause of death worldwide, responsible for approximately 11% of total deaths. This project aims to leverage machine learning techniques to build a predictive model that can identify individuals at risk of stroke based on their demographic and health-related features. By detecting high-risk individuals early, appropriate preventive measures can be taken to reduce the incidence and impact of stroke.

To enhance the accuracy of the stroke prediction model, the dataset will be analyzed and processed using various data science methodologies and algorithms.
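One possible shape for such a processing and modeling step is sketched below with pandas and scikit-learn. It is not the project's actual pipeline: the data is synthetic (random values with the same column names as the dataset), and the choice of logistic regression, median imputation, and class weighting are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the real dataset (assumption: same column names).
# The id column is left out because it carries no predictive information.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "gender": rng.choice(["Male", "Female"], n),
    "age": rng.uniform(1, 90, n),
    "hypertension": rng.integers(0, 2, n),
    "heart_disease": rng.integers(0, 2, n),
    "ever_married": rng.choice(["No", "Yes"], n),
    "work_type": rng.choice(["Private", "Self-employed", "Govt_job"], n),
    "Residence_type": rng.choice(["Rural", "Urban"], n),
    "avg_glucose_level": rng.uniform(55, 270, n),
    "bmi": rng.uniform(15, 45, n),
    "smoking_status": rng.choice(["Never smoked", "Smokes", "Unknown"], n),
})
# Made-up label rule: older synthetic patients are more likely to be positive.
df["stroke"] = (rng.random(n) < df["age"] / 300).astype(int)

numeric = ["age", "avg_glucose_level", "bmi"]
categorical = ["gender", "ever_married", "work_type",
               "Residence_type", "smoking_status"]
binary = ["hypertension", "heart_disease"]

preprocess = ColumnTransformer([
    # Impute missing numeric values with the median, then standardize.
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    # One-hot encode the string-valued columns.
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    # The 0/1 flags are passed through unchanged.
    ("bin", "passthrough", binary),
])

model = Pipeline([
    ("preprocess", preprocess),
    # class_weight="balanced" is one way to handle a rare positive class.
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

X = df.drop(columns=["stroke"])
y = df["stroke"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

model.fit(X_train, y_train)
risk = model.predict_proba(X_test)[:, 1]  # estimated stroke probability per patient
```

Keeping preprocessing inside the `Pipeline` means the imputer, scaler, and encoder are fitted on the training split only, which avoids leaking information from the test split.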
