Detecting Deepfake Audio Using Spectrogram-Based Machine Learning Approaches

Project Code: TCMAPY2012

Objective

The primary objective of this project is to develop a robust and efficient system for detecting deepfake audio using spectrogram-based machine learning. Audio samples are converted into mel-spectrograms, which capture the time-frequency patterns needed to distinguish synthetic speech from genuine recordings. Three convolutional neural networks (MobileNetV2, DenseNet121, and EfficientNetB0) are combined as an ensemble to classify audio inputs from their spectrogram features. The models are trained on the Deep Voice dataset, which contains both real and deepfake audio, with the goal of detecting manipulated audio with high accuracy, thereby supporting audio forensics, strengthening digital security, and mitigating the risks posed by AI-generated synthetic voices.
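
To make the mel-spectrogram step concrete, here is a minimal NumPy-only sketch of how a waveform could be turned into a log-mel spectrogram. It is an illustration under stated assumptions, not the project's actual pipeline: the function names and the defaults (16 kHz sample rate, 512-point FFT, hop of 128 samples, 64 mel bands) are hypothetical, and a real implementation would more likely rely on an audio library.

```python
import numpy as np


def hz_to_mel(hz):
    """Convert frequency in Hz to the mel scale (HTK-style formula)."""
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Build triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    # n_mels + 2 edge frequencies, evenly spaced on the mel scale
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def mel_spectrogram_db(signal, sr=16000, n_fft=512, hop=128, n_mels=64):
    """Return a log-mel spectrogram in dB, shape (n_mels, n_frames).

    All default parameter values are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < n_fft:
        signal = np.pad(signal, (0, n_fft - signal.size))
    n_frames = 1 + (signal.size - n_fft) // hop
    # Slice the signal into overlapping frames and apply a Hann window
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    # Power spectrum per frame, shape (n_frames, n_fft // 2 + 1)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Project onto the mel bands, then convert to decibels with a small floor
    mel_power = mel_filterbank(sr, n_fft, n_mels) @ power.T
    return 10.0 * np.log10(np.maximum(mel_power, 1e-10))
```

Calling `mel_spectrogram_db(waveform)` on a one-dimensional array of samples returns a 2-D array that can be resized and treated as an image for the image classifiers named above.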

Abstract

The rapid advancement of deep learning technologies has led to the creation of highly convincing synthetic speech, raising significant concerns in areas such as digital security, media authenticity, and privacy. In response to this emerging threat, this work proposes a spectrogram-based machine learning framework for detecting deepfake audio. Audio signals are converted into mel-spectrogram representations that capture essential time-frequency features, and these spectrograms are used as input to an ensemble of three classifiers: MobileNetV2, DenseNet121, and EfficientNetB0. The models are trained and evaluated on the Deep Voice dataset, which comprises both real and synthetic voice samples. Experimental results indicate that the proposed method distinguishes deepfake audio from genuine audio with high accuracy. This study underscores the potential of combining visual feature extraction techniques with deep learning models to improve audio forgery detection systems.
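
The abstract does not state how the three classifiers are combined. One common option is soft voting, where the per-class probabilities from each model are averaged and the highest average wins. The sketch below assumes that scheme; the function name `soft_vote`, the class order (0 = real, 1 = deepfake), and the probability values are all hypothetical and used only for illustration.

```python
import numpy as np


def soft_vote(prob_list, weights=None):
    """Average per-model class probabilities and return (labels, avg_probs).

    prob_list: list of arrays, each of shape (n_samples, n_classes).
    weights:   optional per-model weights; defaults to a uniform average.
    """
    # Stack into shape (n_models, n_samples, n_classes)
    probs = np.stack([np.asarray(p, dtype=float) for p in prob_list])
    if weights is None:
        weights = np.ones(probs.shape[0])
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    # Weighted average over the model axis, shape (n_samples, n_classes)
    avg = np.tensordot(weights, probs, axes=1)
    return avg.argmax(axis=1), avg


# Made-up probabilities for two clips; columns assumed to be [real, deepfake]
mobilenet_probs = [[0.9, 0.1], [0.4, 0.6]]
densenet_probs = [[0.7, 0.3], [0.6, 0.4]]
efficientnet_probs = [[0.8, 0.2], [0.3, 0.7]]

labels, avg_probs = soft_vote([mobilenet_probs, densenet_probs, efficientnet_probs])
```

With uniform weights every model contributes equally; passing `weights` would let a stronger model count for more, which is a design choice the source does not specify.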

Keywords: Deepfake Audio, Mel-Spectrogram, Audio Forensics, MobileNetV2, DenseNet121, EfficientNetB0, Deep Voice Dataset, Synthetic Speech Detection, Machine Learning, Ensemble Learning.

NOTE: Please do not submit this to the college without the consent of our team. This abstract varies based on student requirements.

Block Diagram

Specifications

Hardware Requirements

Processor   - Intel i3 Processor
Hard Disk   - 160 GB
Keyboard    - Standard Windows Keyboard
Mouse       - Two or Three Button Mouse
Monitor     - SVGA
RAM         - 8 GB

Software Requirements

Operating System        - Windows 7/8/10
Front-End Technologies  - HTML, CSS, Bootstrap & JS
Programming Language    - Python 3.10
Libraries               - Django, Pandas, NumPy, TensorFlow, Scikit-learn
IDE/Workbench           - VS Code
Database                - SQLite
