Scalable Hybrid Deep Models for Individual Pharmacy Cost Prediction

Project Code :TCMAPY1758

Objective

The primary objective of this project is to develop a robust and scalable predictive modeling framework capable of accurately estimating individual pharmacy costs, particularly for high-utilizer patients within the Medicaid rebate program. By leveraging a diverse set of machine learning algorithms—namely Random Forest, XGBoost, CatBoost, and LightGBM—along with a hybrid deep learning architecture combining Autoencoders and Long Short-Term Memory (LSTM) networks, the project aims to capture both static and temporal features of patient data. This predictive insight is intended to support healthcare providers, policymakers, and Medicaid program administrators in identifying cost-intensive cases early, optimizing pharmaceutical budgeting, and implementing data-driven interventions.

Abstract

Accurately forecasting individual pharmacy costs is crucial for resource planning and policy decision-making in healthcare systems. This study proposes a set of scalable hybrid deep learning and ensemble models aimed at predicting the total pharmacy expenses for patients identified as high utilizers within the Medicaid rebate program. Leveraging the comprehensive dataset available from the Healthcare Cost Prediction Dataset, this work evaluates and compares multiple advanced regression models including Random Forest Regressor, XGBoost Regressor, CatBoost Regressor, and LightGBM Regressor.

To enhance predictive accuracy and capture non-linear temporal dependencies in patient health profiles, we further integrate a hybrid Autoencoder-LSTM model. The autoencoder serves to extract compact, noise-reduced representations of patient features, while the LSTM model captures temporal dynamics and sequential patterns. Our experimental results demonstrate that hybrid models consistently outperform traditional regression approaches in terms of RMSE and MAE metrics, especially for patients with highly variable medication histories. The proposed framework is designed to be scalable and adaptable to other insurance claims data, thus offering potential utility for state-level Medicaid programs and pharmaceutical budget planning. Additionally, the interpretability of tree-based models aids in identifying key drivers of pharmacy cost, contributing to transparent and explainable healthcare analytics.

Keywords: Pharmacy Cost Prediction, Medicaid Rebate Program, High Utilizers, Autoencoder-LSTM, Random Forest, XGBoost, CatBoost, LightGBM, Healthcare Analytics, Scalable Deep Learning.

NOTE: Without the concern of our team, please don't submit to the college. This Abstract varies based on student requirements.

Block Diagram

Specifications

Hardware Requirements

Processor                                 - I3/Intel Processor

Hard Disk                                - 160GB

Key Board                              - Standard Windows Keyboard

Mouse                                     - Two or Three Button Mouse

Monitor                                   - SVGA

RAM                                       - 8GB

 

Software Requirements:

Operating System                   :  Windows 7/8/10

Server side Script                    :  HTML, CSS, Bootstrap & JS

Programming Language         :  Python

Libraries                                  :  Django, Pandas, Numpy, Tensorflow, Scikit-learn.

IDE/Workbench                      :  VS Code

Technology                             :  Python 3.10

Database                                 :  SQLite

Demo Video

mail-banner
call-banner
contact-banner
Request Video