The objective of this project is to develop a text classification system that automatically assigns news articles to predefined categories. The system implements both traditional machine learning algorithms (Logistic Regression and Naive Bayes) and transformer-based deep learning models (BERT, RoBERTa, and DistilBERT). These models analyze news content and predict its category, with the aim of high accuracy, so that users can quickly find news relevant to their interests. By combining traditional machine learning models with state-of-the-art transformer models, the system aims to optimize classification performance. The project seeks to provide a scalable, efficient, and automated solution for real-time news categorization.
Text classification is a core task in natural language processing (NLP) that assigns predefined categories to text based on its content. This study classifies news articles into specific categories using both machine learning and deep learning algorithms. The traditional techniques Logistic Regression and Naive Bayes are employed alongside the transformer-based models BERT, RoBERTa, and DistilBERT, which capture semantic nuances in the text and can improve classification accuracy. Each model predicts the category of a news article from its textual features. Model training and evaluation are carried out in Python, using scikit-learn for the traditional models and Hugging Face's Transformers library for the deep learning approaches. The system aims to provide an efficient and scalable solution for automatic news categorization, and the same approach can be extended to related tasks such as sentiment analysis, topic modeling, and content recommendation. By incorporating both traditional and state-of-the-art techniques, the project highlights the evolving landscape of text classification and its practical application to real-time news categorization.
Keywords: Text Classification, Logistic Regression, Naive Bayes, BERT, RoBERTa, DistilBERT, News Categorization, Machine Learning, Deep Learning, Natural Language Processing (NLP), Sentiment Analysis, Topic Modeling, Hugging Face, Python.
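
As an illustration of the traditional approach described in the abstract, the following is a minimal, dependency-free sketch of a multinomial Naive Bayes classifier with add-one smoothing. It is not the project's implementation, which uses scikit-learn. The class name NaiveBayesNewsClassifier, the whitespace tokenizer, the toy headlines, and the three category labels are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayesNewsClassifier:
    """Hypothetical multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # number of documents per category
        self.word_counts = defaultdict(Counter)  # per-category word frequencies
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Words never seen during training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior of the category.
            score = math.log(doc_count / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                # Smoothed log likelihood of the word given the category.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (illustrative only, not the project's dataset).
train_texts = [
    "the team won the championship match",
    "striker scores twice in cup final",
    "stocks fall as markets react to rate hike",
    "company reports record quarterly profit",
    "new smartphone chip doubles battery life",
    "software update fixes security flaw in browser",
]
train_labels = [
    "sports", "sports",
    "business", "business",
    "technology", "technology",
]

model = NaiveBayesNewsClassifier().fit(train_texts, train_labels)
prediction = model.predict("markets rally after profit report")
print(prediction)
```

In practice, a TF-IDF representation and a larger labeled corpus would replace the raw word counts and toy headlines used here, and the transformer models would be fine-tuned separately.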

SOFTWARE REQUIREMENTS
Operating System : Windows 7/8/10
Front-End Technologies : HTML, CSS, JavaScript
Programming Language : Python
Libraries : Django, Pandas, PyTorch, Keras, scikit-learn, NumPy, Seaborn
IDE/Workbench : VSCode
Server Deployment : XAMPP Server
Database : SQLite
HARDWARE REQUIREMENTS
Processor - Intel Core i3 or higher
RAM - 8GB (min)
Hard Disk - 128 GB
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - Any