Advanced Classification of AI-Generated Images Through Transformers

Project Code: TCPGPY2062

Objective

The primary objectives of this project are as follows:

1. To develop a deep learning-based classifier that can accurately distinguish between real and AI-generated images.
2. To evaluate the performance of three attention-enhanced models (EfficientNetV2-S with cross-attention, MobileNetV3-Large with lightweight ECA attention, and ResNet-50 with SE attention) on the CIFAKE dataset.
3. To integrate the trained models into a Flask web application that provides a user-friendly interface for image classification.
4. To assess model performance using accuracy, precision, recall, and F1-score, and to select the best-performing model for classification.
5. To create a reliable and efficient solution for image verification, synthetic content detection, and other applications that require distinguishing between real and AI-generated images.

The project aims to contribute to the growing need for reliable methods in content authentication and verification.
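The four metrics named in objective 4 can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch in plain Python, not the project's evaluation code. It assumes labels are encoded as 1 for AI-generated and 0 for real; that encoding and the function name `binary_metrics` are illustrative choices.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    Assumption: `positive` marks the AI-generated class (1 here).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels for illustration only; these are not project results.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(y_true, y_pred))
```

Computing the same dictionary for each of the three models on a held-out test split gives a like-for-like basis for choosing the best-performing one.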

Abstract

This project focuses on the advanced classification of AI-generated images using transformer-driven vision models and enhanced convolutional architectures. With the rapid increase in synthetic visual content, distinguishing generated images from natural ones has become an important task for secure digital environments, data validation, and trustworthy visual systems. This work uses the CIFAKE dataset, which contains natural images and AI-generated synthetic images, to design a three-model ensemble built on ResNet-50 with squeeze-and-excitation (SE) attention, EfficientNetV2-S with cross-attention, and MobileNetV3-Large with lightweight efficient channel attention (ECA). The attention mechanisms refine the extracted features by emphasizing informative channels and regions and suppressing less informative ones.
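The abstract does not state how the three models' outputs are combined. One common option is soft voting, where each model's class probabilities are averaged and the highest average wins. The sketch below illustrates that idea in plain Python under stated assumptions: the combination rule, the class order `("real", "ai_generated")`, and the function names are all hypothetical, and the logits are made-up numbers rather than model outputs.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(per_model_logits, class_names=("real", "ai_generated")):
    """Average the class probabilities of several models.

    per_model_logits: one list of logits per model, ordered like class_names.
    Returns the winning class name and the averaged probabilities.
    """
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    averaged = [sum(p[i] for p in probs) / n_models
                for i in range(len(class_names))]
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return class_names[best], averaged


if __name__ == "__main__":
    # Illustrative logits for a single image from three models.
    logits = [[2.0, 0.0], [0.0, 1.0], [1.5, 0.5]]
    label, averaged = soft_vote(logits)
    print(label, [round(p, 3) for p in averaged])
```

Averaging probabilities rather than taking a majority of hard labels lets a confident model outweigh two uncertain ones; weighted averaging or hard voting would be equally valid alternatives.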

The system is deployed using a Flask-based framework, with HTML, CSS, and JavaScript for the interface. Modules for registration, login, classification, and logout provide structured navigation and controlled access. The goal is to create a stable and lightweight image classification pipeline capable of handling diverse samples while keeping training and inference efficient. The proposed design aims to improve representation learning and to highlight the strength of attention-based architectures for synthetic image detection.
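The login, classification, and logout modules described above could be wired together roughly as follows. This is a minimal sketch, not the project's application: the route paths, the JSON responses, the session-based `login_required` decorator, and the `classify_image` stub are all assumptions made for illustration. Registration and real credential checking are omitted, and the stub returns a fixed placeholder instead of calling a trained model.

```python
from functools import wraps

from flask import Flask, jsonify, request, session

app = Flask(__name__)
app.secret_key = "change-me"  # placeholder; use a real secret in practice


def login_required(view):
    """Reject requests that have no logged-in user in the session."""
    @wraps(view)
    def wrapped(*args, **kwargs):
        if "user" not in session:
            return jsonify(error="login required"), 401
        return view(*args, **kwargs)
    return wrapped


def classify_image(image_bytes):
    """Hypothetical stand-in for model inference; returns a fixed answer."""
    return {"label": "real", "confidence": 0.5}


@app.route("/login", methods=["POST"])
def login():
    # Assumption: any non-empty username is accepted (no password check).
    username = request.form.get("username", "").strip()
    if not username:
        return jsonify(error="username required"), 400
    session["user"] = username
    return jsonify(status="logged in")


@app.route("/classify", methods=["POST"])
@login_required
def classify():
    upload = request.files.get("image")
    if upload is None:
        return jsonify(error="no image uploaded"), 400
    return jsonify(classify_image(upload.read()))


@app.route("/logout", methods=["POST"])
def logout():
    session.pop("user", None)
    return jsonify(status="logged out")
```

Keeping inference behind a single function such as `classify_image` means the web layer stays unchanged whichever of the three models, or an ensemble of them, is eventually plugged in.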

Keywords: synthetic image detection, attention mechanism, transformers, ResNet-50, EfficientNetV2-S, MobileNetV3-Large, classification, PyTorch, Flask, CIFAKE

NOTE: Please do not submit this document to the college without our team's consent. This abstract varies based on student requirements.