The main objective of this project is to develop an automated spam detection system that accurately classifies emails as spam or ham using deep learning. The dataset is first preprocessed to clean and tokenize the email content. Word2Vec then generates word embeddings that capture semantic relationships between words, and these embeddings are fed into a BiLSTM model that reads the text in both forward and backward directions. XLNet is integrated to capture long-range dependencies and improve the handling of complex email content. Performance is evaluated with accuracy, precision, recall, and F1 score to check that the system performs well on new, unseen data. A Flask-based web application lets users register, log in, and upload emails for classification, giving them an easy-to-use interface for spam filtering. The overall goal is a reliable, efficient system for automated email classification.
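As a rough illustration of the evaluation step, the four metrics named above can be computed from the counts of true and false positives and negatives. The sketch below is a minimal pure-Python version, not the project's actual evaluation code; the function name `classification_metrics`, the choice of "spam" as the positive class, and the toy labels are all assumptions made for the example.

```python
def classification_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall and F1 for a binary spam/ham task."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for illustration only (not project results).
y_true = ["spam", "ham", "spam", "ham", "spam", "ham"]
y_pred = ["spam", "ham", "ham", "ham", "spam", "spam"]
metrics = classification_metrics(y_true, y_pred)
```

For spam filtering, precision reflects how often a message flagged as spam really is spam, while recall reflects how much of the spam is caught; F1 balances the two.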
Spam detection is critical to keeping email communication effective. This project uses deep learning, specifically Word2Vec combined with BiLSTM (Bidirectional Long Short-Term Memory) and XLNet, to build a system that classifies emails as spam or non-spam (ham). Word2Vec generates word embeddings that capture semantic information from the text, and the BiLSTM processes them to model context in both directions. XLNet, a transformer-based model, complements this by capturing dependencies and relationships across the entire email. The system is trained on a dataset of 190K labeled emails, which helps it generalize across varied email messages. The result is an efficient, reliable approach to automated email classification, aimed at improving spam detection accuracy and giving users a practical tool for filtering unwanted messages.
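To make the path from raw email text to BiLSTM input more concrete, the sketch below tokenizes a message and maps each token to an embedding vector, padding or truncating to a fixed length. It is only an illustration under stated assumptions: the helper names, the `urltoken` placeholder, the cleaning rules, and the 3-dimensional `toy_embeddings` table are hypothetical stand-ins for a trained Word2Vec model, and the project's real preprocessing may differ.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def clean_and_tokenize(text):
    """Lowercase, replace URLs with a placeholder, and split into word tokens."""
    text = text.lower()
    text = URL_RE.sub(" urltoken ", text)
    return TOKEN_RE.findall(text)


def to_padded_sequence(tokens, embeddings, dim, max_len):
    """Look up each token's vector, then pad or truncate to max_len rows.

    Unknown tokens and padding positions both become zero vectors.
    """
    zero = [0.0] * dim
    rows = [list(embeddings.get(tok, zero)) for tok in tokens[:max_len]]
    rows += [list(zero) for _ in range(max_len - len(rows))]
    return rows


# Hypothetical 3-dimensional vectors standing in for trained Word2Vec embeddings.
toy_embeddings = {
    "free": [0.9, 0.1, 0.0],
    "prize": [0.8, 0.2, 0.1],
    "urltoken": [0.7, 0.0, 0.3],
    "meeting": [0.0, 0.9, 0.2],
}

tokens = clean_and_tokenize("Claim your FREE prize at http://example.com now!")
sequence = to_padded_sequence(tokens, toy_embeddings, dim=3, max_len=8)
```

Each email becomes a `max_len` by `dim` matrix, which is the kind of fixed-shape input a bidirectional recurrent layer typically expects.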
Keywords: Spam detection, email classification, Word2Vec, BiLSTM, XLNet, machine learning, deep learning, semantic embeddings, transformer, email filtering.

Hardware Requirements:
Hard Disk - 160GB
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
RAM - 8GB
Software Requirements:
Operating System : Windows 7/8/10
Front-End Technologies : HTML, CSS, Bootstrap & JS
Programming Language : Python
Libraries : Flask/Django, pandas, mysql.connector, os, smtplib, NumPy
IDE/Workbench : PyCharm
Technology : Python 3.6+
Server Deployment : XAMPP Server
Database : MySQL