Phishing Attack Detection in Websites Using RF, DT, and CatBoost Based on Feature Selection

Project Code: TCMAPY2360

Objective

The project aims to develop an intelligent system for detecting phishing websites using machine learning models such as Random Forest, Decision Tree, and CatBoost. Feature selection techniques are applied to improve model accuracy and reduce complexity. The system analyzes website attributes to identify fraudulent behavior, helping users avoid cyber threats and enhancing online security.

Abstract

The Phishing Website Detection Using Machine Learning project presents a solution to the growing threat of phishing attacks, which continue to compromise online security worldwide. The system uses the Random Forest algorithm to classify websites as legitimate or malicious. The approach extracts and analyzes a range of URL and page-level features that help distinguish phishing websites from genuine ones. The key indicators evaluated are domain age, URL length, presence of an IP address, HTTPS usage, redirection behavior, prefix-suffix patterns, URL depth, and use of URL shortening services. These features are extracted using WHOIS lookups, regular expression matching, and HTML/JavaScript content analysis, which makes it possible to detect subtle anomalies commonly introduced by attackers. Suspicious patterns such as unusually long URLs, special symbols like “@” or “-”, and excessive redirections are flagged.
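The lexical indicators listed above can be computed from the URL string alone. The following is a minimal sketch, not the project's actual extractor: the function names, the feature keys, the shortener list, and the position threshold used for the double-slash check are all illustrative assumptions. Domain age (WHOIS) and HTML/JavaScript analysis need network access and are left out.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical, non-exhaustive list of URL shortening services (assumption).
SHORTENERS = {"bit.ly", "tinyurl.com", "goo.gl", "t.co", "ow.ly", "is.gd"}


def has_ip_host(host: str) -> bool:
    """Return True when the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_url_features(url: str) -> dict:
    """Compute a few lexical features from a URL string (no network access)."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # URL depth: number of non-empty path segments.
    depth = len([segment for segment in parsed.path.split("/") if segment])
    return {
        "url_length": len(url),
        "has_ip": has_ip_host(host),
        "uses_https": parsed.scheme == "https",
        "has_at_symbol": "@" in url,
        # Prefix-suffix pattern: a hyphen inside the host name.
        "prefix_suffix": "-" in host,
        "url_depth": depth,
        # A "//" beyond the scheme separator hints at an embedded redirect.
        # The cut-off index 7 is an assumed heuristic.
        "double_slash_redirect": url.rfind("//") > 7,
        "is_shortened": host in SHORTENERS,
    }


if __name__ == "__main__":
    print(extract_url_features("http://192.168.0.1/login/verify/index.php?id=1"))
```

Each value is either a number or a boolean, so the resulting dictionary can be turned into one row of a feature table for a classifier.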

The Random Forest model, an ensemble learning technique based on multiple decision trees, is employed for classification due to its high accuracy, robustness, and ability to handle complex datasets. By aggregating the predictions of multiple trees, the model reduces overfitting and improves overall performance. The trained model achieves an accuracy of 99.8%, which supports its use for real-time phishing detection.
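The aggregation step can be illustrated with a small majority vote. This is only a sketch of the voting idea under stated assumptions: the three "trees" below are hypothetical hand-written rules with made-up thresholds, standing in for trees that a real Random Forest would learn from bootstrap samples of the training data using random feature subsets.

```python
from collections import Counter
from typing import Callable, Dict, List

Features = Dict[str, float]
Tree = Callable[[Features], int]  # returns 1 for phishing, 0 for legitimate


# Hypothetical stand-ins for trained decision trees (assumed rules and thresholds).
def tree_length(f: Features) -> int:
    return 1 if f["url_length"] > 75 else 0


def tree_ip(f: Features) -> int:
    return 1 if f["has_ip"] else 0


def tree_https(f: Features) -> int:
    return 0 if f["uses_https"] else 1


def forest_predict(trees: List[Tree], features: Features) -> int:
    """Return the class that receives the most votes across all trees."""
    votes = Counter(tree(features) for tree in trees)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    forest = [tree_length, tree_ip, tree_https]
    sample = {"url_length": 90, "has_ip": 0, "uses_https": 0}
    print(forest_predict(forest, sample))
```

With an odd number of trees and two classes, the vote cannot tie. A single tree that misfires on one feature is outvoted by the others, which is the intuition behind the reduced overfitting mentioned above.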

The system is implemented as a Flask-based web application integrated with MySQL for secure user management and activity logging. Users can analyze individual URLs or upload datasets for batch processing, receiving clear and actionable results. With its scalability, high accuracy, and user-friendly interface, the project enhances cybersecurity, reduces risks of data theft and financial fraud, and promotes safer internet usage.

NOTE: Please do not submit this to the college without the consent of our team. This abstract varies based on student requirements.