Image Caption Generator with CNN & LSTM

Project Code: TCMAPY193

Objective

In this project, we build an application that automatically generates natural language captions for an image based on its content, using a Convolutional Neural Network (CNN) for feature extraction and a Long Short-Term Memory (LSTM) network for generating the captions. The captions can then be used for indexing and searching images, tagging on social media, assisting the visually impaired, and similar tasks.

Abstract

When we see an image, we can quickly recognize what is going on in it, what objects are present, and what they are doing. With the progress in Artificial Intelligence (AI), we are trying to make computers do the same automatically.

The need for such a system is growing, especially with the advent of autonomous and semi-autonomous vehicles, which involve reading and understanding millions of images. Automatically generating captions for a given image requires Natural Language Processing (NLP) techniques together with neural networks that classify the images. The ability of a computer to generate captions for an image has various business and individual benefits. This model automatically generates natural language captions, which can then be used for indexing and searching images, tagging on social media, assisting the visually impaired, and similar tasks. In this paper, we build such an application using a Convolutional Neural Network (CNN) for feature extraction and a Long Short-Term Memory (LSTM) network for generating the captions.

Keywords: Natural Language Processing (NLP), LSTM, CNN, Inception, Xception, Transfer Learning.
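
To illustrate how the two stages described above fit together, here is a minimal sketch of greedy caption decoding: image features (as a CNN would produce) are passed to a next-word predictor (the role the LSTM plays), one word at a time, until an end marker is produced. This is only an illustration under stated assumptions, not the project's actual implementation. The names `greedy_caption` and `toy_scores` and the `startseq` / `endseq` marker tokens are hypothetical, and `toy_scores` is a hard-coded stand-in for a trained model.

```python
from typing import Callable, Dict, List, Sequence

# Assumed marker tokens that wrap every caption (hypothetical choice).
START, END = "startseq", "endseq"

# A scorer maps (image features, words so far) to a score per candidate word.
Scorer = Callable[[Sequence[float], List[str]], Dict[str, float]]


def greedy_caption(features: Sequence[float],
                   next_word_scores: Scorer,
                   max_len: int = 20) -> str:
    """Build a caption by repeatedly picking the highest-scoring next word."""
    words = [START]
    for _ in range(max_len):
        scores = next_word_scores(features, words)
        best = max(scores, key=scores.get)  # greedy choice
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start marker


def toy_scores(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a trained CNN + LSTM model.

    It ignores the features and follows a fixed script, so the decoding
    loop can be run without any trained weights.
    """
    script = ["a", "dog", "runs", END]
    step = len(words) - 1  # number of words generated so far
    target = script[min(step, len(script) - 1)]
    return {w: (1.0 if w == target else 0.0) for w in script}


if __name__ == "__main__":
    fake_features = [0.1, 0.2, 0.3]  # placeholder for CNN output
    print(greedy_caption(fake_features, toy_scores))
```

In a real system, `toy_scores` would be replaced by a call to the trained model, and the feature vector would come from a pretrained CNN such as the Inception or Xception networks named in the keywords.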

NOTE: Please do not submit this to the college without the consent of our team. This abstract varies based on student requirements.