Adaptive Data Augmentation Techniques for Software Requirements Classification Using Deep Learning

Project Code: TCMAPY2455

Objective

Software requirements classification suffers from class imbalance, which degrades performance on minority classes. This paper proposes adaptive augmentation using TinyLlama 1.1B with few-shot prompting to generate synthetic requirements, balancing 13 requirement types. Three models (BERT, GPT-2, and a RoBERTa+BiGRU hybrid) are fine-tuned and evaluated using accuracy, macro F1, and confusion matrices. Explainability is provided via Integrated Gradients and SHAP. The best model is deployed as a Flask web application for real-time classification. Results show significant improvement across minority classes.
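As a rough illustration of the balancing step, the sketch below computes how many synthetic requirements each class would need and assembles a few-shot prompt for one class. It is a minimal sketch under assumptions, not the project's actual code: the function names `augmentation_plan` and `build_few_shot_prompt`, the prompt wording, the default target (the size of the largest class), and the toy class names are all hypothetical. The call to TinyLlama itself is omitted.

```python
from collections import Counter


def augmentation_plan(labels, target=None):
    """Return how many synthetic samples each class needs to reach `target`."""
    counts = Counter(labels)
    if target is None:
        # Assumption: balance every class up to the size of the largest one.
        target = max(counts.values())
    return {label: max(0, target - n) for label, n in counts.items()}


def build_few_shot_prompt(label, examples, k=3):
    """Assemble a few-shot prompt from up to k real requirements of one class."""
    shots = "\n".join(f"- {text}" for text in examples[:k])
    # Hypothetical template; the wording used in the project is not given.
    return (
        f"Here are software requirements of type '{label}':\n"
        f"{shots}\n"
        f"Write one new, different requirement of type '{label}':\n- "
    )


# Toy data; these class names are illustrative, not the dataset's labels.
labels = ["Functional"] * 5 + ["Security"] * 2 + ["Usability"] * 1
plan = augmentation_plan(labels)

prompt = build_few_shot_prompt(
    "Security",
    [
        "The system shall encrypt stored passwords.",
        "Only administrators shall access audit logs.",
    ],
)
```

Here `plan` asks for 0 extra Functional, 3 extra Security, and 4 extra Usability samples, so the rarest class receives the most generated text. Each generated requirement would still need filtering for duplicates and label correctness before being added to the training set.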

Abstract

Software requirements classification is a critical yet time-consuming task in requirements engineering. Existing datasets suffer from severe class imbalance, degrading classifier performance on minority requirement types. This paper proposes an adaptive data augmentation framework that leverages TinyLlama 1.1B, a lightweight large language model, to generate synthetic requirements for underrepresented classes using few-shot prompting, balancing all 13 requirement types to a uniform target size. Three deep learning models are fine‑tuned and evaluated on the augmented dataset: BERT (bert‑base‑uncased), GPT‑2, and a novel hybrid RoBERTa + BiGRU architecture that combines RoBERTa contextual embeddings with a two‑layer bidirectional GRU and masked mean pooling for enhanced sequential representation. All models are assessed using accuracy, macro F1‑score, per‑class precision and recall, and confusion matrices. Explainability is provided through Layer Integrated Gradients for BERT and SHAP force plots for GPT‑2 and RoBERTa + BiGRU, making predictions interpretable. The best‑performing model is deployed as a local Flask web application enabling real‑time requirement classification. Experimental results demonstrate that adaptive augmentation significantly improves classification performance across all minority requirement categories.
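Masked mean pooling, mentioned above for the RoBERTa + BiGRU hybrid, averages the per-token vectors of a sequence while skipping padding positions, so short requirements are not diluted by padding. The sketch below shows the idea in plain Python on one sequence. It is an illustration under assumptions: the function name `masked_mean_pool` is hypothetical, and a real model would apply the same operation to batched tensors (presumably the BiGRU outputs, though the abstract does not say exactly where pooling is applied).

```python
def masked_mean_pool(hidden_states, attention_mask):
    """Average the token vectors whose mask is 1, ignoring padding positions.

    hidden_states: list of token vectors (one list of floats per token)
    attention_mask: list of 0/1 flags, 1 for real tokens, 0 for padding
    """
    dim = len(hidden_states[0])
    total = [0.0] * dim
    kept = 0
    for vector, keep in zip(hidden_states, attention_mask):
        if keep:
            kept += 1
            for i, value in enumerate(vector):
                total[i] += value
    kept = max(kept, 1)  # guard against an all-padding sequence
    return [value / kept for value in total]


# Two real tokens followed by one padding token with large values.
hidden = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = masked_mean_pool(hidden, mask)  # [2.0, 3.0]
```

The padding vector `[100.0, 100.0]` has no effect on the result, which is the point of the mask: a plain mean over all three positions would be dominated by it.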

Keywords: software requirements classification, data augmentation, TinyLlama, BERT, GPT‑2, RoBERTa, bidirectional GRU, natural language processing, deep learning, SHAP, explainability, requirements engineering

NOTE: Please do not submit this abstract to the college without our team's consent. The abstract varies based on student requirements.

Block Diagram

Specifications

1. SOFTWARE REQUIREMENTS

Operating System : Windows 7/8/10

Front-End Technologies : HTML, CSS, Bootstrap, JavaScript

Programming Language : Python

Libraries : Flask, Pandas, scikit-learn, PyTorch, NumPy, Seaborn, Matplotlib, Pillow, Transformers, SHAP

IDE/Workbench : VS Code

Technology : Python 3.8+

Server Deployment : XAMPP Server

Database : MySQL

 

2.     HARDWARE REQUIREMENTS

Processor - Intel Core i5

RAM - 8 GB (minimum)

Hard Disk - 128 GB or more

Keyboard - Standard Windows Keyboard

Mouse - Two- or Three-Button Mouse

Monitor - Any
