Investigating Evasive Techniques in SMS Spam Filtering: A Comparative Analysis of Machine Learning Models

Project Code: TCMAPY1259

Objective

The primary objective of this project is to evaluate and compare the effectiveness of four machine learning models (Decision Tree, Random Forest, RoBERTa, and DistilRoBERTa) in detecting SMS spam. Using a dataset from Kaggle, the study aims to identify the best-performing models in terms of accuracy, precision, and recall. The project assesses each model's ability to distinguish legitimate ('ham') messages from unsolicited ('spam') messages, and investigates each model's resilience against the evasion techniques employed by spammers. The ultimate goal is to improve understanding of how well these models adapt to evolving spam tactics in SMS communications.
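To make the idea of an evasion technique concrete, here is a minimal sketch of two common text obfuscations: look-alike character substitution and spacing out a trigger word. The substitution table, function names, and parameters are hypothetical illustrations, not the perturbations used in the project; resilience could then be estimated by comparing a model's predictions on original and perturbed messages.

```python
import random

# Hypothetical look-alike substitutions, chosen for illustration only.
LEET_MAP = {"a": "@", "e": "3", "i": "1", "o": "0", "s": "$"}


def obfuscate(message, rate=0.5, seed=0):
    """Replace a fraction of mappable characters with look-alike symbols.

    rate is the probability that each mappable character is replaced;
    seed makes the perturbation reproducible.
    """
    rng = random.Random(seed)
    out = []
    for ch in message:
        sub = LEET_MAP.get(ch.lower())
        if sub is not None and rng.random() < rate:
            out.append(sub)
        else:
            out.append(ch)
    return "".join(out)


def insert_spacing(message, word):
    """Split every occurrence of a trigger word with spaces."""
    return message.replace(word, " ".join(word))


# With rate=1.0 every mappable character is replaced.
print(obfuscate("free prize", rate=1.0))               # fr33 pr1z3
print(insert_spacing("win a free phone", "free"))      # win a f r e e phone
```

Fixing the seed keeps the perturbed test set identical across models, so any difference in scores reflects the model and not the random draw.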

Abstract

This study addresses the challenges of SMS spam detection by analysing and comparing the effectiveness of several machine learning models on a Kaggle-sourced dataset containing both spam and non-spam messages. We employed Decision Tree, Random Forest, RoBERTa, and DistilRoBERTa models to discern patterns and distinguish legitimate messages from unsolicited ones. The data consisted of labelled messages, with 'ham' indicating non-spam and 'spam' indicating unsolicited content. Each model was evaluated on its accuracy, precision, and recall in identifying spam. The objective was to identify which models performed best in terms of detection rates and resilience against the evasion techniques used by spammers. Our analysis aims to provide insight into how well advanced machine learning models adapt to the evolving tactics of SMS spam.

Keywords: Decision Tree, Random Forest, RoBERTa, DistilRoBERTa.
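The three evaluation metrics named in the abstract can be computed directly from 'ham'/'spam' labels. The following is a minimal sketch that treats 'spam' as the positive class; the function name and the toy label lists are illustrative assumptions, not project results.

```python
def spam_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, and recall for a binary ham/spam task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    # Precision: of the messages flagged as spam, how many really were spam.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the real spam messages, how many were caught.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Toy labels for illustration only (not data from the study).
y_true = ["ham", "spam", "spam", "ham", "spam"]
y_pred = ["ham", "spam", "ham", "spam", "spam"]
metrics = spam_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters for spam data because the 'ham' class usually dominates, so a model that labels everything 'ham' can still score high accuracy while catching no spam.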

NOTE: Please do not submit this to the college without the consent of our team. The abstract varies based on student requirements.

Block Diagram

Specifications

H/W SPECIFICATIONS:

•    Processor    : i5/Intel Processor
•    RAM          : 8 GB (min)
•    Hard Disk    : 128 GB
•    Keyboard     : Standard Windows Keyboard
•    Mouse        : Two- or Three-Button Mouse
•    Monitor      : Any


S/W SPECIFICATIONS:

•    Operating System      : Windows 7+
•    Server-side Script    : Python 3.6+
•    IDE                   : PyCharm
•    Libraries Used        : pandas, NumPy, Matplotlib, os

Demo Video
