Machine Learning for Early Detection of Phishing URLs in Parked Domains

Project Code: TCMAPY2052

Objective

The objective of this project is to detect phishing URLs in parked domains and classify each URL as either a Legitimate Website (Safe) or a Phishing Website (Dangerous). Using four machine learning algorithms, namely Support Vector Machine (SVM), K-Nearest Neighbors (KNN), AdaBoost, and Decision Tree, the project aims to build a robust, scalable system for early phishing detection. The system analyzes URL patterns and features to identify malicious websites before they cause harm. By flagging phishing attempts in real time, it is intended to strengthen cybersecurity for users and organizations and to help prevent data theft, fraud, and identity theft.
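As an illustration of what "analyzing URL patterns and features" can mean in practice, the following is a minimal sketch that turns a URL string into a few numeric features. The function name `extract_url_features` and the feature set itself are assumptions chosen for illustration; the project's actual features are not specified here.

```python
import re
from urllib.parse import urlparse

# Matches a dotted-quad host such as "192.168.0.1" (illustrative check only;
# it does not validate that each octet is in the 0-255 range).
IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def extract_url_features(url: str) -> dict:
    """Return a small, hypothetical set of lexical features for one URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),                         # total characters in the URL
        "host_length": len(host),                       # characters in the host name
        "num_dots": host.count("."),                    # many dots can suggest deep subdomains
        "num_hyphens": host.count("-"),                 # hyphens in the host name
        "num_digits": sum(ch.isdigit() for ch in url),  # digits anywhere in the URL
        "has_at_symbol": int("@" in url),               # "@" anywhere in the URL
        "host_is_ip": int(bool(IPV4_RE.match(host))),   # raw IPv4 address instead of a domain
        "uses_https": int(parsed.scheme == "https"),    # scheme is https
    }


features = extract_url_features("http://192.168.0.1/login@secure-update")
print(features)
```

Each URL becomes one row of numbers, which is the form that classifiers such as SVM, KNN, AdaBoost, and Decision Tree expect as input.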

Abstract

Phishing attacks are a significant threat to internet users: phishing websites use deceptive URLs to capture sensitive data. This project focuses on the early detection of phishing URLs in parked domains using machine learning. The system applies four algorithms, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), AdaBoost, and Decision Tree, to classify URLs into two categories: Legitimate Website (Safe) and Phishing Website (Dangerous). Each algorithm analyzes the characteristics of a URL and learns the patterns that distinguish phishing sites from legitimate ones. The models are trained and evaluated in Python, using the scikit-learn implementations of SVM, KNN, AdaBoost, and Decision Tree. The system aims to provide an efficient and scalable solution for early phishing detection and to improve cybersecurity for users and organizations. By combining several traditional machine learning techniques, the project demonstrates the potential of automated detection systems for identifying phishing threats in parked domains.
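The training and evaluation step described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the data is synthetic (generated with `make_classification` as a stand-in for real URL features), and the hyperparameters, the 75/25 split, and the use of feature scaling for SVM and KNN are illustrative choices, not the project's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic placeholder data: 8 numeric features per "URL",
# label 0 = legitimate, label 1 = phishing (assumed encoding).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# SVM and KNN are distance-based, so they are wrapped with a scaler;
# the tree-based models do not need scaling.
models = {
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=42),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in scores.items():
    print(f"{name}: accuracy = {acc:.3f}")
```

With real data, `X` would hold features extracted from labeled URLs. Accuracy alone can be misleading when phishing and legitimate samples are imbalanced, so precision and recall are worth reporting as well.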

Keywords: Phishing Detection, Machine Learning, SVM, KNN, AdaBoost, Decision Tree, URL Classification, Cybersecurity, Parked Domains, URL Analysis, Phishing Websites, Legitimate Websites, Early Detection, Python.

NOTE: Please do not submit this abstract to your college without our team's consent. The abstract may vary based on student requirements.

Block Diagram

Specifications

SOFTWARE REQUIREMENTS

Operating System          :  Windows 7/8/10

Front-End Technologies    :  HTML, CSS, JavaScript

Programming Language      :  Python

Libraries                 :  Flask, Pandas, PyTorch, Keras, scikit-learn, NumPy, Seaborn

IDE/Workbench             :  VS Code

Server Deployment         :  XAMPP Server

Database                  :  SQLite

HARDWARE REQUIREMENTS

Processor                 -  Intel Core i3 or other Intel processor

RAM                       -  8 GB (minimum)

Hard Disk                 -  128 GB

Keyboard                  -  Standard Windows keyboard

Mouse                     -  Two- or three-button mouse

Monitor                   -  Any

Demo Video