Emotion Recognition using Speech Processing

Project Code: TCMAPY1122

Objective

Clear research objectives guide the direction of the study and keep the investigation focused. In this dissertation, the objectives address the overarching goal of enhancing speech emotion recognition using Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) architectures. The primary objective is to develop a robust speech emotion recognition system capable of accurately classifying human emotions conveyed through speech signals. This involves designing and training CNN and LSTM models to extract features from speech data and classify them into distinct emotional categories. The second objective is to evaluate the CNN and LSTM models against traditional approaches to speech emotion recognition. The proposed models' accuracy, efficiency, and robustness in capturing subtle differences in human emotions will be assessed through extensive experimentation on a variety of datasets.
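The evaluation objective can be illustrated with a minimal sketch, assuming overall accuracy, a confusion matrix, and per-class recall are the metrics of interest (the source does not name specific metrics). The function names, the three emotion labels, and the toy predictions below are placeholders chosen for illustration; they are not results from this project.

```python
def accuracy(y_true, y_pred):
    """Fraction of utterances whose predicted emotion matches the true one."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels, in `labels` order."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_recall(matrix, labels):
    """Recall per emotion: correct predictions divided by true occurrences."""
    recalls = {}
    for i, label in enumerate(labels):
        total = sum(matrix[i])
        recalls[label] = matrix[i][i] / total if total else 0.0
    return recalls


if __name__ == "__main__":
    # Toy placeholder data, NOT experimental results.
    labels = ["angry", "happy", "sad"]
    y_true = ["angry", "happy", "sad", "happy", "sad", "angry"]
    y_pred = ["angry", "happy", "happy", "happy", "sad", "sad"]
    cm = confusion_matrix(y_true, y_pred, labels)
    print("accuracy:", accuracy(y_true, y_pred))
    print("confusion matrix:", cm)
    print("recall:", per_class_recall(cm, labels))
```

Running the same functions on the predictions of each candidate model (CNN, LSTM, or a traditional baseline) over a shared test split gives a like-for-like comparison. Per-class recall is useful here because emotion datasets are often unbalanced across categories.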

Abstract

This project advances emotion recognition in speech using Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models. By integrating these techniques into speech processing, we aim to improve the interpretation of human emotions from speech signals. Our objective is to refine human-computer interaction, with potential applications spanning mental health support and customer service analytics. The system uses CNN and LSTM architectures together with features such as Mel-frequency cepstral coefficients (MFCC) and the chromagram to classify speech into emotional categories. Key stages of the project include data collection, feature extraction, emotion classification, model evaluation, real-time implementation, and user interface development. Challenges include capturing the nuanced nature of human emotions, accommodating variations in speech patterns, and ensuring real-time processing capability. Through this work, we aim to contribute to the evolution of speech emotion recognition systems.
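The feature extraction stage can be sketched as follows. This is a minimal NumPy-only MFCC computation, assuming a 16 kHz mono signal, 25 ms frames with a 10 ms hop, 26 mel filters, and 13 coefficients; none of these values come from the source, and the project may well rely on a dedicated audio library instead. Chromagram extraction is not covered here.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frame_signal(signal, frame_len, hop_len):
    """Split a 1-D array into overlapping frames (trailing samples are dropped)."""
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return signal[idx]


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank


def mfcc(signal, sample_rate=16000, frame_len=400, hop_len=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an array of shape (n_frames, n_coeffs). All defaults are assumptions."""
    signal = np.asarray(signal, dtype=float)
    frames = frame_signal(signal, frame_len, hop_len) * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)  # small offset avoids log(0)
    # Unnormalized DCT-II over the filter axis decorrelates the log energies.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_energies @ basis.T


if __name__ == "__main__":
    # One second of a synthetic 440 Hz tone stands in for a speech recording.
    t = np.arange(16000) / 16000.0
    features = mfcc(np.sin(2 * np.pi * 440.0 * t))
    print(features.shape)
```

The resulting frames-by-coefficients matrix is the kind of two-dimensional input a CNN can convolve over, while an LSTM can read it frame by frame as a time sequence.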

KEYWORDS: Emotion Recognition, Speech Processing, Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), Mel-frequency cepstral coefficients (MFCC), Chromagram, Human-Computer Interaction, Mental Health Support, Customer Service Analytics, Feature Extraction, Model Evaluation, Real-Time Implementation, User Interface Development, Human Emotions, Speech Pattern Variations, Real-Time Processing.

NOTE: Please do not submit this to the college without the consent of our team. This abstract varies based on student requirements.

Block Diagram

Specifications

H/W Specifications:

Processor     :  Intel i5

RAM           :  8 GB (min)

Hard Disk     :  128 GB


S/W Specifications:

Operating System     :  Windows 10

Server-side Script   :  Python 3.6

IDE                  :  PyCharm, Jupyter Notebook

Libraries Used       :  NumPy, io, os, Flask, Keras, pandas, TensorFlow


Demo Video