PDF Malware Detection using specific classes: Toward Machine Learning Modeling With Explainability Analysis

Project Code: TCMAPY1299

Objective

The project develops a machine learning system that detects malware in PDF files. It compares several classification algorithms and aims for high detection accuracy, interpretable predictions, and real-time threat mitigation.

Abstract

PDF files are widely used for document sharing, and that popularity makes them a frequent target for malware attacks. This project, titled "PDF Malware Detection: Toward Machine Learning Modeling With Explainability Analysis," develops and evaluates machine learning models for detecting malware in PDF files. Using a Kaggle dataset of labeled malicious and benign PDFs, it applies Random Forest, C5.0, J48, Support Vector Machine (SVM), AdaBoost, Deep Neural Network (DNN), Gradient Boosting Machine (GBM), and K-Nearest Neighbors (KNN) classifiers. The primary focus is high detection accuracy combined with explainability, so that the reasons behind each model's decisions can be understood. The project thereby seeks to strengthen cybersecurity measures by identifying and mitigating threats embedded in PDF documents.
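To make the pairing of classification and explainability concrete, here is a minimal sketch using only the Python standard library. It trains a small K-Nearest Neighbors classifier (one of the algorithms listed above) and then estimates permutation importance, which is one common model-agnostic explainability technique and not necessarily the one used in this project. The feature names (`javascript_count`, `open_action_count`, `page_count`) and every data row are made up for illustration; they are not taken from the Kaggle dataset.

```python
import random
from collections import Counter

# Hypothetical feature names and toy rows, invented for illustration only.
# Label 1 = malicious, 0 = benign.
FEATURES = ["javascript_count", "open_action_count", "page_count"]
TRAIN_X = [
    [3, 1, 1], [2, 1, 2], [4, 1, 1], [3, 0, 3],
    [0, 0, 5], [0, 0, 2], [0, 1, 4], [0, 0, 1],
]
TRAIN_Y = [1, 1, 1, 1, 0, 0, 0, 0]
TEST_X = [[3, 1, 2], [0, 0, 3], [2, 0, 1], [0, 1, 2]]
TEST_Y = [1, 0, 1, 0]


def knn_predict(train_X, train_y, row, k=3):
    """Classify one row by majority vote among its k nearest training rows."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, t)), label)
        for t, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_X, train_y, row) == label
        for row, label in zip(test_X, test_y)
    )
    return hits / len(test_y)


def permutation_importance(train_X, train_y, test_X, test_y, repeats=20, seed=0):
    """Mean drop in test accuracy when one feature column is shuffled."""
    rng = random.Random(seed)
    base = accuracy(train_X, train_y, test_X, test_y)
    drops = {}
    for j, name in enumerate(FEATURES):
        total = 0.0
        for _ in range(repeats):
            column = [row[j] for row in test_X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(test_X, column)]
            total += base - accuracy(train_X, train_y, shuffled, test_y)
        drops[name] = total / repeats
    return base, drops


if __name__ == "__main__":
    base, drops = permutation_importance(TRAIN_X, TRAIN_Y, TEST_X, TEST_Y)
    print(f"baseline accuracy: {base:.2f}")
    for name, drop in sorted(drops.items(), key=lambda item: -item[1]):
        print(f"{name}: mean accuracy drop {drop:.2f}")
```

A larger accuracy drop suggests the model leans more heavily on that feature. On a dataset this small the numbers are only illustrative; a real evaluation would use the full labeled dataset and a proper train/test split.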

 

Keywords: PDF malware detection, machine learning, Random Forest, SVM, DNN, explainability, cybersecurity, malicious PDF, classification algorithms, Kaggle dataset.

NOTE: Please do not submit this to your college without our team's consent. This abstract varies based on student requirements.

Block Diagram

Specifications

Hardware Requirements:

RAM                  :  8 GB

Hard disk or SSD     :  More than 500 GB

Processor            :  Intel 3rd generation or later, or AMD Ryzen


Software Requirements:


Operating system     :  Windows 7 or later

Language             :  Python 3.10 or later

IDE                  :  Visual Studio Code

Framework            :  Flask
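The software requirements above can be checked before running the project. The following is a minimal sketch using only the standard library; the function name `check_environment` and the constants are hypothetical helpers introduced here, not part of the project code.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)           # minimum version from the requirements above
REQUIRED_PACKAGES = ["flask"]  # import name of the Flask framework


def check_environment(version=sys.version_info, packages=REQUIRED_PACKAGES):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if tuple(version[:2]) < MIN_PYTHON:
        found = ".".join(str(part) for part in version[:2])
        wanted = ".".join(str(part) for part in MIN_PYTHON)
        problems.append(f"Python {wanted} or later is required, found {found}")
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("Environment OK" if not issues else "\n".join(issues))
```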

Demo Video