The primary objective of this project is to develop a robust and scalable predictive modeling framework that accurately estimates individual pharmacy costs, particularly for high-utilizer patients within the Medicaid rebate program. The project combines four machine learning algorithms (Random Forest, XGBoost, CatBoost, and LightGBM) with a hybrid deep learning architecture built from Autoencoders and Long Short-Term Memory (LSTM) networks, so that both static and temporal features of patient data are captured. The resulting predictions are intended to support healthcare providers, policymakers, and Medicaid program administrators in identifying cost-intensive cases early, optimizing pharmaceutical budgeting, and implementing data-driven interventions.
Accurately forecasting individual pharmacy costs is crucial for resource planning and policy decision-making in healthcare systems. This study proposes a set of scalable ensemble and hybrid deep learning models for predicting the total pharmacy expenses of patients identified as high utilizers within the Medicaid rebate program. Using the Healthcare Cost Prediction Dataset, this work evaluates and compares several regression models: Random Forest Regressor, XGBoost Regressor, CatBoost Regressor, and LightGBM Regressor.
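The model comparison described above can be illustrated with a minimal sketch. It is not the project's actual pipeline: the data is synthetic, the three feature names are invented for illustration, and scikit-learn's GradientBoostingRegressor stands in for the boosted models. XGBRegressor, CatBoostRegressor, and LGBMRegressor expose the same fit/predict interface, so they could be added to the same dictionary if those packages are installed.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split


def make_synthetic_claims(n_patients=500, seed=0):
    """Build made-up patient features and a right-skewed cost target (illustrative only)."""
    rng = np.random.default_rng(seed)
    age = rng.integers(18, 90, size=n_patients)
    n_prescriptions = rng.poisson(6, size=n_patients)
    chronic_conditions = rng.integers(0, 5, size=n_patients)
    X = np.column_stack([age, n_prescriptions, chronic_conditions]).astype(float)
    # Cost grows with prescriptions and chronic conditions; lognormal noise adds skew.
    base = 40.0 * n_prescriptions + 150.0 * chronic_conditions + 2.0 * age
    y = base * rng.lognormal(mean=0.0, sigma=0.3, size=n_patients)
    return X, y


def compare_models(models, X, y, seed=0):
    """Fit each model on one shared train split and score it on the held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[name] = {
            "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
            "mae": float(mean_absolute_error(y_test, pred)),
        }
    return results


X, y = make_synthetic_claims()
models = {
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    # Stand-in for XGBoost / CatBoost / LightGBM regressors.
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}
results = compare_models(models, X, y)
for name, scores in results.items():
    print(f"{name}: RMSE={scores['rmse']:.2f}  MAE={scores['mae']:.2f}")
```

Scoring every model on the same held-out split keeps the RMSE and MAE figures directly comparable across models.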
To enhance predictive accuracy and capture non-linear temporal dependencies in patient health profiles, we further integrate a hybrid Autoencoder-LSTM model. The autoencoder extracts compact, noise-reduced representations of patient features, while the LSTM captures temporal dynamics and sequential patterns. Our experimental results show that the hybrid models consistently outperform traditional regression approaches on root mean squared error (RMSE) and mean absolute error (MAE), especially for patients with highly variable medication histories. The proposed framework is designed to be scalable and adaptable to other insurance claims data, which makes it potentially useful for state-level Medicaid programs and pharmaceutical budget planning. Additionally, the interpretability of the tree-based models helps identify the key drivers of pharmacy cost, contributing to transparent and explainable healthcare analytics.
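One possible way to wire an autoencoder and an LSTM together is sketched below with the Keras API in TensorFlow. This is an assumption-laden illustration, not the architecture used in the study: the sequence length, feature count, layer sizes, and the choice to train reconstruction and cost prediction jointly are all hypothetical, and the data is random noise used only to show the expected tensor shapes.

```python
import numpy as np
from tensorflow.keras import Model, layers

TIMESTEPS = 12    # hypothetical: 12 periodic snapshots per patient
N_FEATURES = 20   # hypothetical: features per snapshot
LATENT_DIM = 8    # hypothetical: size of the compressed representation


def build_autoencoder_lstm(timesteps, n_features, latent_dim):
    """Per-time-step autoencoder whose codes feed an LSTM cost regressor."""
    inputs = layers.Input(shape=(timesteps, n_features))
    # Autoencoder part: the same dense encoder/decoder is applied at each time step.
    encoded = layers.TimeDistributed(
        layers.Dense(latent_dim, activation="relu"), name="encoder"
    )(inputs)
    reconstructed = layers.TimeDistributed(
        layers.Dense(n_features), name="reconstruction"
    )(encoded)
    # LSTM part: reads the sequence of compressed codes and predicts one cost value.
    hidden = layers.LSTM(16, name="temporal")(encoded)
    cost = layers.Dense(1, name="cost")(hidden)
    model = Model(inputs, [reconstructed, cost])
    # Joint training: reconstruction loss plus cost regression loss (an assumption).
    model.compile(optimizer="adam", loss=["mse", "mse"], loss_weights=[0.5, 1.0])
    return model


# Random placeholder data, only to demonstrate shapes and the training call.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, TIMESTEPS, N_FEATURES)).astype("float32")
y = rng.normal(size=(32, 1)).astype("float32")

model = build_autoencoder_lstm(TIMESTEPS, N_FEATURES, LATENT_DIM)
model.fit(X, [X, y], epochs=1, batch_size=16, verbose=0)
recon, cost_pred = model.predict(X[:4], verbose=0)
print(recon.shape, cost_pred.shape)
```

A two-stage alternative would pretrain the autoencoder on reconstruction alone and then fit the LSTM on the frozen codes. The source does not say which variant was used.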
Keywords: Pharmacy Cost Prediction, Medicaid Rebate Program, High Utilizers, Autoencoder-LSTM, Random Forest, XGBoost, CatBoost, LightGBM, Healthcare Analytics, Scalable Deep Learning.

Hardware Requirements:
Hard Disk - 160GB
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
RAM - 8GB
Software Requirements:
Operating System : Windows 8.1/10 (Python 3.10 does not run on Windows 7)
Front End : HTML, CSS, Bootstrap & JS
Programming Language : Python
Libraries : Django, Pandas, NumPy, TensorFlow, scikit-learn, XGBoost, CatBoost, LightGBM.
IDE/Workbench : VS Code
Technology : Python 3.10
Database : SQLite