Using a labeled text dataset, this project predicts whether a given text is AI-generated or human-written. The ultimate goal is to classify any input text as either AI-generated or human-written.
With the rapid advancement of natural language generation technologies, distinguishing machine-generated text from human-written content has become increasingly challenging yet essential. This project aims to develop a robust, multilingual system capable of accurately identifying machine-generated text across languages, including English, Indonesian, German, and Russian. Using a dataset of 674,083 training samples and 288,894 development samples, each described by attributes such as source, sub-source, language, generation model, label, and text, we explore the efficacy of various machine learning and deep learning algorithms.
To achieve reliable classification, the system integrates Random Forest, Long Short-Term Memory (LSTM) networks, Bidirectional Encoder Representations from Transformers (BERT), Decision Tree, and Logistic Regression models. Each model is rigorously evaluated on its ability to handle multilingual data, focusing on both accuracy and computational efficiency. This project combines traditional machine learning with cutting-edge deep learning techniques, contributing a valuable tool for digital content verification by enabling precise differentiation between human-authored and machine-generated text. The proposed system supports a wide range of applications in content validation and enhances trust in digital information across multiple languages and contexts.
Keywords: Multilingual text analysis, Random Forest, Long Short-Term Memory (LSTM) networks, Bidirectional Encoder Representations from Transformers (BERT), Decision Tree, Logistic Regression and Natural Language Processing (NLP).
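
The evaluation criteria named in the abstract (accuracy on multilingual data and computational cost) can be illustrated with a small helper. This is only a minimal sketch under stated assumptions, not the project's actual evaluation code: the function name evaluate_by_language, the dictionary keys "text", "language" and "label", the label encoding (1 = machine-generated, 0 = human-written), and the toy predictor are all hypothetical. It uses only the Python standard library, so any trained model that exposes a "list of texts in, list of labels out" function could be plugged in.

```python
import time
from collections import defaultdict


def evaluate_by_language(predict, samples):
    """Return (per-language accuracy, overall accuracy, seconds spent predicting).

    predict: callable mapping a list of texts to a list of labels
    samples: iterable of dicts with "text", "language" and "label" keys
             (hypothetical field names, modeled on the dataset attributes)
    """
    samples = list(samples)

    # Time only the prediction step as a rough proxy for computational cost.
    start = time.perf_counter()
    predictions = predict([s["text"] for s in samples])
    elapsed = time.perf_counter() - start

    correct = defaultdict(int)
    total = defaultdict(int)
    for sample, predicted in zip(samples, predictions):
        total[sample["language"]] += 1
        correct[sample["language"]] += int(predicted == sample["label"])

    per_language = {lang: correct[lang] / total[lang] for lang in total}
    overall = sum(correct.values()) / len(samples) if samples else 0.0
    return per_language, overall, elapsed


# Toy stand-in for a trained model (assumption): flags texts containing "as an ai".
def toy_predict(texts):
    return [1 if "as an ai" in t.lower() else 0 for t in texts]


# Tiny hand-made examples; assumed encoding: 1 = machine-generated, 0 = human-written.
toy_samples = [
    {"text": "As an AI, I cannot do that.", "language": "en", "label": 1},
    {"text": "Went hiking with my dog today.", "language": "en", "label": 0},
    {"text": "Ich war gestern im Kino.", "language": "de", "label": 0},
    {"text": "Das Wetter ist ein komplexes System.", "language": "de", "label": 1},
]

per_language, overall, elapsed = evaluate_by_language(toy_predict, toy_samples)
print(per_language, overall)
```

Reporting accuracy per language alongside the overall figure makes it visible when a model performs well on one language but poorly on another, which a single aggregate score would hide.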

Processor - Intel Core i3
Hard Disk - 160GB
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
RAM - 8GB
• Operating System : Windows 7/8/10
• Programming Language : Python
• Libraries : pandas, NumPy, scikit-learn
• IDE/Workbench : Visual Studio Code