Machine Learning-Based Life Expectancy Prediction in Developed and Developing Regions

Project Code: TCMAPY2203

Objective

The objective of the Life Expectancy Prediction System is to predict life expectancy from socio-economic and health-related factors using three machine learning models: Gradient Boosting Machine (GBM), LightGBM, and K-Nearest Neighbors (KNN). The system uses features such as GDP, education, and healthcare access as predictors. Explainable AI, via LIME, makes the results transparent and interpretable. The goal is to support healthcare and policy decision-making with data-driven insights.
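As an illustration of how a KNN model turns features into a life expectancy estimate, here is a minimal pure-Python sketch. The function name knn_predict, the two feature columns, and all numeric values are hypothetical and are not taken from the project dataset; the project itself may well rely on a library implementation instead.

```python
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict a target as the mean of the k nearest training targets."""
    # Euclidean distance from the query point to every training row
    distances = [
        (math.dist(row, query), target)
        for row, target in zip(train_X, train_y)
    ]
    # Sort by distance and keep the k closest neighbours
    distances.sort(key=lambda pair: pair[0])
    nearest = distances[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Hypothetical, already-scaled rows: [gdp_index, schooling_index]
train_X = [[0.1, 0.2], [0.2, 0.3], [0.8, 0.9], [0.9, 0.8], [0.5, 0.5]]
# Hypothetical life expectancy values in years
train_y = [55.0, 58.0, 80.0, 82.0, 70.0]

prediction = knn_predict(train_X, train_y, [0.85, 0.85], k=2)
print(prediction)
```

Because KNN compares raw distances, features on very different scales (for example GDP versus years of schooling) would normally be scaled first; the toy rows above are assumed to be scaled already.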

Abstract

This project proposes a machine learning-based framework for predicting life expectancy in both developed and developing regions using Gradient Boosting Machine (GBM), LightGBM, and K-Nearest Neighbors (KNN) models. The models are trained on a dataset of socio-economic and health indicators, including Adult Mortality, Alcohol consumption, Hepatitis B, Polio, GDP, and Schooling. A key aspect of the project is the use of Explainable AI (XAI), specifically Local Interpretable Model-agnostic Explanations (LIME), to make the prediction process transparent and interpretable. The data is preprocessed by handling missing values, engineering features, and resampling to address class imbalance. The GBM model achieved an R² score of 0.9861, while the LightGBM and KNN models achieved R² scores of 0.9902 and 0.9908, respectively. A Flask-based web application lets users interact with the model through the Home, Register, Login, Classification, and Logout modules. This work demonstrates the potential of machine learning models to predict life expectancy accurately while keeping predictions explainable through XAI techniques.
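Two of the steps mentioned in the abstract, handling missing values and scoring a model with R², can be sketched in plain Python. This is only an illustrative sketch: mean imputation is one assumed strategy (the abstract does not say which one the project uses), and the helper names and sample values are hypothetical.

```python
def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical Schooling column with one missing entry
schooling = [12.0, None, 10.0, 14.0]
filled = impute_mean(schooling)

# Hypothetical true and predicted life expectancy values (years)
y_true = [60.0, 70.0, 80.0]
y_pred = [61.0, 69.0, 80.0]
score = r2_score(y_true, y_pred)
print(filled, score)
```

An R² close to 1 means the predictions explain nearly all of the variance in the target, which is how scores such as 0.9861 or 0.9908 should be read.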

Keywords: Life expectancy, Machine Learning, Gradient Boosting Machine, LightGBM, KNN, XAI, LIME, Data Preprocessing, Flask, Predictive Modeling

NOTE: Please do not submit this to the college without the consent of our team. This abstract varies based on student requirements.

Block Diagram

Specifications

The hardware requirements specify the physical resources necessary to run the system efficiently. For this life expectancy prediction system, the following are the recommended hardware specifications:

  • Processor: Intel Core i5 or higher (preferably with multiple cores for handling data processing and model training)
  • Hard Disk: 250GB or higher (preferably SSD for faster read/write operations)
  • RAM: 16GB or more (for smooth handling of large datasets and simultaneous processes)
  • Keyboard: Standard Windows Keyboard (or any keyboard compatible with the system)
  • Mouse: Two or Three Button Mouse
  • Monitor: SVGA or higher resolution (1080p recommended for clarity when viewing predictions and visualizations)
  • Graphics Card: Dedicated GPU (NVIDIA or AMD, 2GB or more) for faster model training (optional, but recommended for deep learning tasks)
  • Network: Stable internet connection (for web-based deployment and user interaction)
  • Web Server: Local or cloud-based server (e.g., AWS, Azure) to deploy the Flask web application.

Software Requirements

The software requirements specify the environment and tools necessary to develop, run, and deploy the system. The required software components for this life expectancy prediction system are as follows:

  • Operating System: Windows 7/8/10, Linux, or macOS
  • Programming Language: Python 3.x
  • Libraries:
    • Pandas: For data manipulation, cleaning, and analysis.
    • NumPy: For numerical operations, handling large multidimensional arrays, and performing matrix operations.
    • scikit-learn: For implementing machine learning models, feature selection, and evaluation metrics.
    • XGBoost: For building and training Gradient Boosting Machine models.
    • LightGBM: For training LightGBM models, optimized for large datasets.
    • K-Nearest Neighbors (KNN): For KNN-based predictions; KNN is not a separate library and is available through scikit-learn.
    • LIME: For implementing Local Interpretable Model-agnostic Explanations to interpret model predictions.
    • Flask: For developing the web application to deploy the system.
    • Matplotlib/Seaborn: For data visualization, including charts and graphs for model evaluation and predictions.
    • MySQL/SQLite: For database management to store user data and prediction results.
  • IDE/Workbench:
    • Visual Studio Code: A lightweight and versatile code editor with Python support.
    • PyCharm: An IDE optimized for Python development with features like code completion, debugging, and project management.
    • Jupyter Notebooks: For experimenting, running code in cells, and visualizing data and model results interactively.
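The software list names SQLite as one option for storing user data and prediction results. A minimal sketch of that idea with Python's built-in sqlite3 module could look like the following; the table name, column names, the save_prediction helper, and the sample values are all assumptions made for illustration, not the project's actual schema.

```python
import json
import sqlite3

def save_prediction(conn, username, features, predicted_years):
    """Store one prediction together with the inputs that produced it."""
    conn.execute(
        "INSERT INTO predictions (username, features, predicted_years) "
        "VALUES (?, ?, ?)",
        # Features are serialized to JSON so the schema stays simple
        (username, json.dumps(features), predicted_years),
    )
    conn.commit()

# An in-memory database keeps the sketch self-contained;
# a real deployment would point at a file path instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE predictions (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           username TEXT NOT NULL,
           features TEXT NOT NULL,
           predicted_years REAL NOT NULL
       )"""
)

# Hypothetical user and feature values
save_prediction(conn, "demo_user", {"GDP": 5000.0, "Schooling": 12.0}, 71.4)
rows = conn.execute(
    "SELECT username, predicted_years FROM predictions"
).fetchall()
print(rows)
```

Parameterized queries (the ? placeholders) are used so that user-supplied values from the web form are never interpolated directly into SQL.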
