The primary objective of this project is to develop a robust system for sentiment and semantic analysis of small and medium-sized news headline datasets in the sports, science, and agricultural domains. The system aims to classify news headlines accurately into predefined categories (World, Sports, Business, and Sci/Tech) using two machine learning pipelines: DistilBERT + LightGBM and SBERT + k-NN.
This study addresses sentiment and semantic analysis of small and medium-sized news headline datasets, specifically within the sports, science, and agricultural domains. With the growing need for effective content classification, the task of assigning news headlines to predefined categories such as World, Sports, Business, and Sci/Tech has gained considerable attention. Two machine learning approaches are explored: DistilBERT for textual feature extraction combined with LightGBM for classification, and SBERT sentence embeddings paired with a k-NN classifier for finer-grained semantic matching. Both models are evaluated on accuracy: SBERT + k-NN achieves 92.14% and DistilBERT + LightGBM achieves 91.55%. Both models perform well, with SBERT + k-NN ahead by 0.59 percentage points, which suggests that it captures the meaning of news headlines across domains slightly better. The findings underscore the value of pre-trained transformer models such as SBERT for sentiment and semantic analysis, especially on smaller datasets where generalization is critical. The study offers insight into applying transformer-based models to news classification and contributes to the development of more accurate and scalable systems for domain-specific news analytics.
Keywords: Sentiment Analysis, Semantic Analysis, News Classification, Small Datasets, Sports Domain, Science Domain, Agricultural Domain, DistilBERT, LightGBM, SBERT.
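The SBERT + k-NN approach described in the abstract can be illustrated with a minimal sketch: each headline is mapped to an embedding vector, and a new headline receives the majority label of its k most similar training headlines under cosine similarity. The abstract gives no implementation details, so everything here is an assumption: the names `toy_embed`, `KNNHeadlineClassifier`, and `accuracy` are hypothetical, the sample headlines are invented for illustration, and `toy_embed` is a hashed bag-of-words stand-in, not SBERT. In a real setup the `embed` callable would presumably wrap a sentence encoder, for example the `encode` method of a `SentenceTransformer` model from the sentence-transformers library.

```python
import hashlib
import math
from collections import Counter
from typing import Callable, List, Sequence


def toy_embed(text: str, dim: int = 64) -> List[float]:
    """Stand-in embedding (assumption): hashed bag-of-words, L2-normalised.

    This is NOT SBERT. It only keeps the sketch runnable without model downloads.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        # md5 is stable across runs, unlike the salted built-in hash().
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class KNNHeadlineClassifier:
    """k-NN over headline embeddings with majority voting (hypothetical helper)."""

    def __init__(self, embed: Callable[[str], Sequence[float]], k: int = 3):
        self.embed = embed
        self.k = k
        self._vectors: List[Sequence[float]] = []
        self._labels: List[str] = []

    def fit(self, headlines: Sequence[str], labels: Sequence[str]) -> "KNNHeadlineClassifier":
        # "Training" only stores the embedded headlines and their labels.
        self._vectors = [self.embed(h) for h in headlines]
        self._labels = list(labels)
        return self

    def predict(self, headline: str) -> str:
        query = self.embed(headline)
        # Rank stored headlines from most to least similar to the query.
        ranked = sorted(
            zip(self._vectors, self._labels),
            key=lambda pair: cosine(query, pair[0]),
            reverse=True,
        )
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


if __name__ == "__main__":
    # Invented headlines, for illustration only.
    train = [
        ("Striker scores twice in cup final", "Sports"),
        ("Champions win league title again", "Sports"),
        ("New telescope captures distant galaxy", "Sci/Tech"),
        ("Researchers unveil faster battery chip", "Sci/Tech"),
    ]
    clf = KNNHeadlineClassifier(embed=toy_embed, k=1)
    clf.fit([h for h, _ in train], [label for _, label in train])
    print(clf.predict("Team wins cup final"))
```

Passing the encoder in as a callable keeps the k-NN logic independent of the embedding model, so the stand-in can be replaced without touching the classifier. For datasets beyond a few thousand headlines, a library implementation of nearest-neighbour search would likely be preferable to this linear scan.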

Operating System : Windows 7/8/10
Client-side Script : HTML, CSS, JavaScript
Programming Language : Python
Libraries : Flask, Pandas, scikit-learn, TensorFlow, NumPy, Seaborn, Matplotlib
IDE/Workbench : VS Code
Technology : Python 3.8+
Server Deployment : XAMPP Server
Database : MySQL
Processor - Intel Core i5
RAM - 8 GB (minimum)
Hard Disk - 128 GB or more
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - Any