A Hybrid Machine Learning Framework for Water Quality Index Prediction Using Feature-Based Neural Network Initialization

Project Code: TCMAPY2471

Objective

This project develops a hybrid machine learning framework for predicting the Water Quality Index (WQI) in freshwater ecosystems. It combines several models (Random Forest, XGBoost, Stacking Regressor, Voting Regressor, HistGradient Boosting, and a custom Deep Residual Network) and uses SHAP (SHapley Additive exPlanations) both for feature-informed neural network initialization and for model interpretability. The system takes water quality parameters (e.g., pH, dissolved oxygen, turbidity) through data preprocessing, model training with hyperparameter optimization, and real-time prediction. It achieves high accuracy (Deep Residual Network R² = 0.9872) with reduced error margins. Designed for scalability and transparency, the framework helps environmental managers make data-driven decisions for water resource management. Planned enhancements include IoT integration and deep learning extensions.
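As a rough illustration of how a voting ensemble and the R² metric relate, the sketch below averages the predictions of several base regressors and scores the result. The base models are hypothetical stand-in functions, not the trained Random Forest or XGBoost models, and the sample rows and targets are made-up illustrative values.

```python
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def voting_predict(models, rows):
    """Unweighted voting: average the base-model predictions for each row."""
    return [mean(model(row) for model in models) for row in rows]


# Hypothetical stand-ins for trained base regressors.
# Each row is (pH, dissolved oxygen, turbidity); coefficients are illustrative.
def model_a(row):
    ph, do, turbidity = row
    return 8.0 * do - 1.5 * turbidity + 2.0 * ph


def model_b(row):
    ph, do, turbidity = row
    return 7.0 * do - 2.5 * turbidity + 3.0 * ph


if __name__ == "__main__":
    rows = [(7.0, 8.0, 2.0), (6.5, 6.0, 5.0), (7.5, 9.0, 1.0)]
    targets = [78.0, 55.0, 90.0]  # made-up WQI values for illustration
    predictions = voting_predict([model_a, model_b], rows)
    print("ensemble predictions:", predictions)
    print("R^2:", r2_score(targets, predictions))
```

A stacking regressor differs in that a second-stage model learns how to combine the base predictions instead of taking a fixed average.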

Abstract

Accurate prediction of the Water Quality Index (WQI) is crucial for safeguarding public health and managing freshwater resources effectively. Traditional machine learning models often suffer from arbitrary weight initialization and limited use of ensemble learning, which leads to unstable predictions and reduced interpretability. In this study, we introduce a hybrid machine learning framework that addresses these issues by combining feature-informed neural network initialization with ensemble methods such as XGBoost, Random Forest, Stacking, HistGradient Boosting, and a Voting Regressor, alongside a Deep Residual Network. The neural network weights are initialized using feature significance scores derived from SHapley Additive exPlanations (SHAP), which provide a more informed and interpretable starting point than random initialization. Predictions are then refined iteratively using the ensemble methods to improve model performance and robustness. By drawing on both feature-based initialization and ensemble learning, the framework produces more reliable and interpretable WQI predictions, supporting better water quality management and resource protection.
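One plausible reading of feature-informed initialization is to scale the first-layer weights attached to each input feature by that feature's normalized importance. The sketch below assumes this scaling scheme, which the abstract does not spell out. The importance scores are hypothetical placeholders for mean absolute SHAP values, and the function names are illustrative only.

```python
import random


def normalize_importance(scores):
    """Scale non-negative importance scores so that they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return {name: score / total for name, score in scores.items()}


def init_first_layer(importance, n_hidden, seed=0):
    """Build first-layer weights indexed as weights[hidden_unit][feature].

    Each weight is a small uniform random value multiplied by
    n_features * importance[feature], so uniform importance reproduces
    the plain random initialization (assumed scheme, not the authors').
    """
    rng = random.Random(seed)
    names = list(importance)
    n_features = len(names)
    bound = (1.0 / n_features) ** 0.5  # simple fan-in based range
    weights = []
    for _ in range(n_hidden):
        row = [
            rng.uniform(-bound, bound) * n_features * importance[name]
            for name in names
        ]
        weights.append(row)
    return names, weights


if __name__ == "__main__":
    # Hypothetical mean |SHAP| values per feature, for illustration only.
    raw_scores = {"ph": 0.2, "dissolved_oxygen": 0.6, "turbidity": 0.2}
    importance = normalize_importance(raw_scores)
    names, weights = init_first_layer(importance, n_hidden=4, seed=42)
    print(names)
    for row in weights:
        print([round(w, 4) for w in row])
```

In a PyTorch model, the resulting matrix could be copied into the first linear layer before training; features with larger importance then start with proportionally larger weights.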

Keywords: Water Quality Index (WQI), Hybrid Machine Learning Framework, Feature-Informed Neural Network Initialization, XGBoost, Random Forest, Stacking, Voting Regressor, SHapley Additive exPlanations (SHAP), Ensemble Learning, Interpretability, Water Resource Management

NOTE: Please do not submit this to the college without our team's consent. The abstract may vary based on student requirements.

Block Diagram

Specifications

3.1 SOFTWARE REQUIREMENTS

Operating System: Windows 7/8/10

Front-End Technologies: HTML, CSS, Bootstrap & JS

Programming Language: Python

Libraries: Flask, Torch, Keras, Pandas, Json, NumPy, Seaborn

IDE/Workbench: VSCode

Server Deployment: Xampp Server

Database: SQLite

3.2 HARDWARE REQUIREMENTS

Processor - Intel Core i3

RAM - 8 GB (min)

Hard Disk - 128 GB

Keyboard - Standard Windows Keyboard

Mouse - Two or Three Button Mouse

Monitor - Any
