Developing Audio Narratives from Visual Images for the Visually Impaired

Project Code: TMMAAI358

Objective

The objective of this project is to develop a voice-integrated image captioning system that combines CNN and LSTM techniques, enhancing accessibility for visually impaired individuals through intuitive voice-based interaction.

Abstract

The proposed model integrates voice commands with image processing and caption generation to improve accessibility for visually impaired individuals. The system takes microphone input and processes the user's voice command to select an image from a predefined folder. A Convolutional Neural Network (CNN) then extracts features from the selected image, and a Long Short-Term Memory (LSTM) network generates a coherent, contextually appropriate caption from those features. Finally, the system converts the caption text to speech so that the user hears the image description. Because both image selection and caption delivery are voice-based, users who cannot visually interpret images can navigate and understand visual content through voice, which makes the system user-friendly and illustrates the use of image processing in assistive technologies. The model demonstrates how combining image processing with Natural Language Processing (NLP) can address real-world accessibility challenges.
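The pipeline described in the abstract (voice command, image selection, feature extraction, caption generation, spoken output) can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's implementation: it is written in Python rather than MATLAB, the speech recognizer is represented by an already-transcribed string, the CNN and text-to-speech stages are passed in as callbacks, the trained LSTM is replaced by a toy `toy_next_word` function, and greedy word-by-word decoding is assumed. All function names here are hypothetical.

```python
import difflib
from pathlib import Path

START, END = "<start>", "<end>"  # assumed caption boundary tokens


def select_image(transcript, image_names, cutoff=0.8):
    """Return the file whose name (without extension) best matches a spoken word."""
    stems = {Path(name).stem.lower(): name for name in image_names}
    for word in transcript.lower().split():
        match = difflib.get_close_matches(word, list(stems), n=1, cutoff=cutoff)
        if match:
            return stems[match[0]]
    return None  # no word in the command resembled any file name


def greedy_caption(features, next_word, max_len=20):
    """Build a caption one word at a time until END or max_len is reached."""
    caption = [START]
    for _ in range(max_len):
        word = next_word(features, caption)
        if word == END:
            break
        caption.append(word)
    return " ".join(caption[1:])  # drop the START token


def toy_next_word(features, caption_so_far):
    """Stand-in for a trained LSTM: replays a fixed script, ignoring the features."""
    script = ["a", "dog", "runs", END]
    index = len(caption_so_far) - 1
    return script[index] if index < len(script) else END


def describe(transcript, image_names, extract_features, next_word, speak):
    """Run the whole pipeline: select image, caption it, and speak the result."""
    name = select_image(transcript, image_names)
    if name is None:
        speak("No matching image found")
        return None
    caption = greedy_caption(extract_features(name), next_word)
    speak(caption)
    return caption


if __name__ == "__main__":
    # print() stands in for text-to-speech; the lambda stands in for the CNN.
    describe("open the dog picture", ["cat.jpg", "dog.png"],
             lambda name: name, toy_next_word, print)
```

Passing the feature extractor, word predictor, and speech output as parameters keeps each stage replaceable, so a real CNN, LSTM, or text-to-speech engine could be substituted without changing the control flow.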

Keywords: NLP (Natural Language Processing), CNN (Convolutional Neural Network), LSTM (Long Short-Term Memory), RNN (Recurrent Neural Network)

NOTE: Please do not submit this abstract to your college without our team's consent. The abstract may vary based on student requirements.

Block Diagram

Specifications

Software: MATLAB R2020a or later

Hardware:

Operating Systems:

Windows 10
Windows 7 Service Pack 1
Windows Server 2019
Windows Server 2016

Processors:

Minimum: Any Intel or AMD x86-64 processor

Recommended: Any Intel or AMD x86-64 processor with four logical cores and AVX2 instruction set support

Disk:

Minimum: 2.9 GB of HDD space for MATLAB only, 5-8 GB for a typical installation

Recommended: An SSD. A full installation of all MathWorks products may take up to 29 GB of disk space.

RAM:

Minimum: 4 GB

Recommended: 8 GB

Learning Outcomes

· Introduction to MATLAB

· What EISPACK and LINPACK are

· How to start with MATLAB

· About the MATLAB language

· MATLAB coding skills

· About tools & libraries

· Application Program Interface in MATLAB

· About the MATLAB desktop

· How to use the MATLAB editor to create M-files

· Features of MATLAB

· Basics of MATLAB

· What is an Image/pixel?

· About image formats

· Introduction to Image Processing

· How a digital image is formed

· Importing images via image acquisition tools

· Analyzing and manipulating images

· Phases of image processing:

o Acquisition

o Image enhancement

o Image restoration

o Color image processing

o Image compression

o Morphological processing

o Segmentation, etc.

· How to extend our work to other real-time applications

· Project development skills

o Problem analysis skills

o Problem-solving skills

o Creativity and imagination skills

o Programming skills

o Deployment

o Testing skills

o Debugging skills

o Project presentation skills

o Thesis writing skills
