Identification of Spambots and Fake Followers on Social Network via Interpretable AI-Based Machine Learning

Project Code: TCMAPY2019

Objective

The primary objective of this project is to develop a machine learning-based solution for identifying spambots and fake followers on social media platforms. Several classification algorithms (K-Nearest Neighbors, BaggingClassifier, StackingClassifier, and CatBoostClassifier) are applied to a dataset containing features such as user interactions, engagement metrics, and post content. A second goal is to make the models transparent through Partial Dependence Plots (PDP), which show how the average predicted outcome changes as a single feature varies, so that users can understand how each feature influences the predictions. The project therefore aims for a solution whose results are both accurate and explainable. It also focuses on optimizing the models to handle large-scale datasets efficiently, so that the system can scale to extensive social media networks, and on providing actionable insights into the factors that drive fake account detection for platform administrators who want to improve the integrity of their online environments.
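The idea behind a partial dependence curve can be sketched in a few lines. The example below is only an illustration under stated assumptions: the data is synthetic (generated with `make_classification`, not the project dataset), the helper name `partial_dependence_curve` is hypothetical, and K-Nearest Neighbors is used simply because it is one of the classifiers named above. For each grid value, one feature is overwritten in every row and the predicted probability of the positive class is averaged.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier


def partial_dependence_curve(model, X, feature_index, grid_points=10):
    """Average predicted probability of class 1 as one feature is varied.

    Hypothetical helper for illustration; not taken from the project code.
    """
    column = X[:, feature_index]
    grid = np.linspace(column.min(), column.max(), grid_points)
    averages = []
    for value in grid:
        X_modified = X.copy()
        # Force the chosen feature to the same value for every account.
        X_modified[:, feature_index] = value
        averages.append(model.predict_proba(X_modified)[:, 1].mean())
    return grid, np.array(averages)


# Synthetic stand-in for account features (assumption: 6 numeric features).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, random_state=0
)
model = KNeighborsClassifier(n_neighbors=5).fit(X, y)

grid, curve = partial_dependence_curve(model, X, feature_index=0)
for value, average in zip(grid, curve):
    print(f"feature_0 = {value:7.3f} -> mean P(fake) = {average:.3f}")
```

Plotting `curve` against `grid` gives the partial dependence plot for that feature. scikit-learn also provides `sklearn.inspection.PartialDependenceDisplay` for drawing such plots directly from a fitted estimator.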

Abstract

Identifying spambots and fake followers on social media platforms is essential for ensuring the authenticity of user interactions and maintaining the integrity of online environments. Social networks are increasingly affected by fake accounts that inflate engagement metrics, spread misinformation, and disrupt meaningful conversations. To address this issue, this project proposes a solution that combines machine learning algorithms with interpretable AI techniques to identify and classify spambots and fake followers. The dataset used for training the models contains features such as user interactions, post content, engagement metrics, and other metadata. Classification is performed with K-Nearest Neighbors (KNN), BaggingClassifier, StackingClassifier, and CatBoostClassifier. The models are made interpretable with Partial Dependence Plots (PDP), which show how different features influence the prediction outcomes. The models are evaluated with accuracy, precision, recall, and F1-score to measure how well they distinguish fake accounts from legitimate users. The project aims to deliver an automated solution that aids in the identification of fake followers and spambots and offers insights that enhance platform credibility and user trust. The results can contribute to the development of more transparent and efficient methods for detecting fake activity across platforms.
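A minimal sketch of the classification and evaluation steps described in the abstract is given below. Everything specific in it is an assumption: the data is synthetic, the hyperparameters and the base learners of the stacking model are illustrative, and the function name `evaluate_models` is hypothetical. CatBoostClassifier is left out so that the sketch depends only on scikit-learn; it exposes the same `fit`/`predict` interface and could be added to the `models` dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def evaluate_models(random_state=0):
    """Train three classifiers and return their test-set metrics."""
    # Synthetic stand-in for the account dataset (label 1 = fake account).
    X, y = make_classification(
        n_samples=600, n_features=8, n_informative=5, random_state=random_state
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )

    models = {
        "knn": KNeighborsClassifier(n_neighbors=5),
        # Bagging uses decision trees as its default base estimator.
        "bagging": BaggingClassifier(n_estimators=25, random_state=random_state),
        "stacking": StackingClassifier(
            estimators=[
                ("knn", KNeighborsClassifier(n_neighbors=5)),
                ("tree", DecisionTreeClassifier(random_state=random_state)),
            ],
            final_estimator=LogisticRegression(max_iter=1000),
        ),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        precision, recall, f1, _ = precision_recall_fscore_support(
            y_test, predictions, average="binary", zero_division=0
        )
        scores[name] = {
            "accuracy": accuracy_score(y_test, predictions),
            "precision": precision,
            "recall": recall,
            "f1": f1,
        }
    return scores


scores = evaluate_models()
for name, metrics in scores.items():
    summary = ", ".join(f"{key}={value:.3f}" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

Reporting precision and recall alongside accuracy matters here because fake accounts are often a minority class, and accuracy alone can look high even when many of them are missed.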

Keywords: Spambots, Fake Followers, Machine Learning, Classification, Partial Dependence Plots, CatBoost, BaggingClassifier, K-Nearest Neighbors, StackingClassifier, Web Interface.

NOTE: Please do not submit this to the college without the consent of our team. This abstract varies based on student requirements.

Block Diagram

Specifications

HARDWARE REQUIREMENTS

• Processor - Intel Core i5

• RAM - 8GB (min)

• Hard Disk - 160 GB

• Keyboard - Standard Windows Keyboard

• Mouse - Two or Three Button Mouse

• Monitor - Any

SOFTWARE REQUIREMENTS

• Operating System : Windows 7/8/10

• Front-End Technologies : HTML, CSS, Bootstrap & JS

• Programming Language : Python

• Libraries : Flask, Pandas, mysql.connector, os, NumPy, scikit-learn

• IDE/Workbench : VS-Code

• Technology : Python 3.10+

• Server Deployment : XAMPP Server

• Database : MySQL
