Medical Insurance Cost Prediction — Machine Learning Case Study
Aug 11, 2022
This case study predicts medical insurance costs using machine learning in Python. For this project, I used a Linear Regression model.
Table of Contents:
- Insurance Cost Data Collection link
- Data Analysis
- Data Preprocessing
- Train Test Split
- Linear Regression Model
- Trained Linear Regression Model
- Test using New Data
- Prediction
1. Insurance Cost Data Collection link
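As a rough sketch of this step, the data can be read into a pandas DataFrame and checked for size and missing values. The column names (age, sex, bmi, children, smoker, region, charges) are an assumption based on the features analysed later in this post, and the rows below are invented placeholders, not records from the real dataset. With the real file, a path would be passed in place of the in-memory text.

```python
import io
import pandas as pd

# Hypothetical stand-in for the insurance CSV. These rows are invented
# for illustration and are NOT taken from the real dataset.
SAMPLE_CSV = """age,sex,bmi,children,smoker,region,charges
19,female,27.9,0,yes,southwest,16000.0
33,male,22.7,0,no,northwest,4500.0
45,female,30.1,2,no,southeast,8200.0
52,male,35.4,1,yes,northeast,41000.0
"""

def load_insurance_data(source):
    """Read the insurance table from a file path or a file-like object."""
    return pd.read_csv(source)

insurance_dataset = load_insurance_data(io.StringIO(SAMPLE_CSV))

print(insurance_dataset.shape)           # (rows, columns)
print(insurance_dataset.isnull().sum())  # missing values per column
```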
2. Data Analysis
1. Age Distribution
2. Gender Distribution
3. BMI Distribution
4. Children Distribution
5. Smoker Distribution
6. Region Distribution
7. Charges Distribution
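The distributions listed above are shown as plots in the original post. A minimal text-only sketch of the same idea, assuming the column names below and using invented toy rows, summarises the numeric columns with describe() and counts the categories with value_counts():

```python
import pandas as pd

# Invented toy rows for illustration only; column names are assumed.
df = pd.DataFrame({
    "age": [19, 33, 45, 52, 28, 61],
    "sex": ["female", "male", "female", "male", "male", "female"],
    "bmi": [27.9, 22.7, 30.1, 35.4, 24.3, 29.0],
    "children": [0, 0, 2, 1, 3, 0],
    "smoker": ["yes", "no", "no", "yes", "no", "no"],
    "region": ["southwest", "northwest", "southeast",
               "northeast", "southeast", "southwest"],
    "charges": [16000.0, 4500.0, 8200.0, 41000.0, 5100.0, 13500.0],
})

# Continuous features: count, mean, std, min, quartiles, max.
numeric_summary = df[["age", "bmi", "charges"]].describe()

# Discrete features: how many rows fall into each category.
category_counts = {
    col: df[col].value_counts()
    for col in ["sex", "children", "smoker", "region"]
}

print(numeric_summary)
for col, counts in category_counts.items():
    print(f"\n{col}:\n{counts}")
```

The same counts and summaries are what a histogram or count plot would visualise.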
3. Data Preprocessing
Encoding categorical features
After encoding
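Linear regression needs numeric inputs, so the text columns have to be turned into numbers. The sketch below shows one way to do that with Series.map. The specific integer codes are an assumption chosen for illustration; the post does not state which codes were used, and the rows are invented.

```python
import pandas as pd

# Invented toy rows for illustration only; column names are assumed.
df = pd.DataFrame({
    "age": [19, 33, 45, 52],
    "sex": ["female", "male", "female", "male"],
    "bmi": [27.9, 22.7, 30.1, 35.4],
    "children": [0, 0, 2, 1],
    "smoker": ["yes", "no", "no", "yes"],
    "region": ["southwest", "northwest", "southeast", "northeast"],
    "charges": [16000.0, 4500.0, 8200.0, 41000.0],
})

# Assumed integer codes (hypothetical; not confirmed by the article).
SEX_CODES = {"male": 0, "female": 1}
SMOKER_CODES = {"yes": 0, "no": 1}
REGION_CODES = {"southeast": 0, "southwest": 1, "northeast": 2, "northwest": 3}

# assign() returns a new frame, leaving the original text columns untouched.
encoded = df.assign(
    sex=df["sex"].map(SEX_CODES),
    smoker=df["smoker"].map(SMOKER_CODES),
    region=df["region"].map(REGION_CODES),
)

# map() yields NaN for any value missing from the dictionary, so check for that.
if encoded.isnull().any().any():
    raise ValueError("Found a category with no assigned code")

print(encoded)
```

Integer codes for region impose an arbitrary ordering on the regions; one-hot encoding is a common alternative when that ordering is a concern.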
4. Splitting the Data into Train and Test
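A minimal sketch of the split, using scikit-learn's train_test_split on a small randomly generated table that stands in for the encoded data. The 80/20 ratio and the random_state value are assumptions; the post does not state the ones it used.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Randomly generated stand-in for the encoded dataset (not real data).
rng = np.random.default_rng(0)
encoded = pd.DataFrame({
    "age": rng.integers(18, 65, size=10),
    "sex": rng.integers(0, 2, size=10),
    "bmi": rng.uniform(18, 40, size=10).round(1),
    "children": rng.integers(0, 4, size=10),
    "smoker": rng.integers(0, 2, size=10),
    "region": rng.integers(0, 4, size=10),
    "charges": rng.uniform(1000, 40000, size=10).round(2),
})

# Features are every column except the target; the target is the charge.
X = encoded.drop(columns="charges")
Y = encoded["charges"]

# Assumed settings: hold out 20% for testing, fixed seed for repeatability.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=2
)

print(X.shape, X_train.shape, X_test.shape)
```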
5. Model training
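Training comes down to fitting scikit-learn's LinearRegression on the training split. The sketch below uses synthetic, noise-free data generated from known coefficients (an assumption made purely so the result can be checked), which shows that the fitted model recovers the weights and intercept that produced the data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic training data (not the insurance data): two features and a
# target built from known weights, with no noise added.
rng = np.random.default_rng(42)
X_train = rng.uniform(0, 10, size=(50, 2))
true_coef = np.array([3.0, -2.0])
true_intercept = 5.0
Y_train = X_train @ true_coef + true_intercept

# Ordinary least squares fit.
regressor = LinearRegression()
regressor.fit(X_train, Y_train)

print("coefficients:", regressor.coef_)
print("intercept:", regressor.intercept_)
```

On the real dataset the fit is the same call, with one learned coefficient per feature column.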
6. Model prediction
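R squared value of Train data: 0.751505643411174
R squared value of Test: 0.7447273869684077
The two scores are close to each other, which suggests the model is not overfitting. It explains roughly 75% of the variance in charges.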
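For reference, R squared is defined as 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the true values from their mean. The sketch below computes it by hand on four made-up values and compares the result with scikit-learn's r2_score. The helper name r_squared is hypothetical.

```python
import numpy as np
from sklearn.metrics import r2_score

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    return 1.0 - ss_res / ss_tot

# Made-up values for illustration only.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

manual = r_squared(y_true, y_pred)
library = r2_score(y_true, y_pred)

print(manual, library)
```

In the workflow above, the same metric would be evaluated once on the training predictions and once on the test predictions.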
7. Building a Predictive System
The insurance cost is USD 3760.080576496055
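A predictive system of this kind typically takes one person's features, reshapes them into a single-row array, and passes that row to the trained model. The sketch below assumes six features in the order age, sex, bmi, children, smoker, region. It trains on synthetic data with made-up coefficients, so the printed number is not comparable to the figure above. The function name predict_cost and the input values are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free training data with made-up weights (not real data).
rng = np.random.default_rng(1)
X_train = rng.uniform(0, 50, size=(40, 6))
assumed_coef = np.array([2.0, 1.0, 3.0, 4.0, 5.0, 6.0])
Y_train = X_train @ assumed_coef + 10.0

regressor = LinearRegression().fit(X_train, Y_train)

def predict_cost(model, features):
    """Predict the charge for one person given a flat sequence of features."""
    # scikit-learn expects a 2D array: one row per sample.
    row = np.asarray(features, dtype=float).reshape(1, -1)
    return float(model.predict(row)[0])

# Hypothetical input: age, sex, bmi, children, smoker, region (encoded).
input_data = (30, 0, 25.0, 1, 1, 0)
cost = predict_cost(regressor, input_data)

print("The insurance cost is USD", cost)
```

The reshape step matters because predict() rejects a one-dimensional array; it needs shape (1, n_features) for a single sample.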
GitHub Repo link: