Image Caption Generator with CNN & LSTM

Project Code: TCMAPY193

Objective

In this project, we build an application that automatically generates natural-language captions for an image based on its content, using a Convolutional Neural Network (CNN) for feature extraction and a Long Short-Term Memory (LSTM) network for caption generation. The captions can then be used for indexing and searching images, tagging on social media, assisting the visually impaired, and similar tasks.

Abstract

When we see an image, we can quickly recognize what is going on in it: which objects are present and what they are doing. With the progress in Artificial Intelligence (AI), we are trying to have computers do the same automatically.

The need for such a system is increasing, especially with the advent of autonomous and semi-autonomous vehicles, which involve reading and understanding millions of images. Automatically generating a caption for a given image requires neural networks to interpret the image content and Natural Language Processing (NLP) techniques to express that content as a sentence. The ability of a computer to caption an image has various business and individual benefits: the generated captions can be used for indexing and searching images, tagging on social media, assisting the visually impaired, and similar tasks. In this paper, we build such an application using a Convolutional Neural Network (CNN) for feature extraction and a Long Short-Term Memory (LSTM) network for generating the captions.
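As a rough illustration of the caption-generation step described above, the sketch below shows greedy word-by-word decoding in plain Python. The function name generate_caption, the startseq/endseq markers, and the toy_scores stub are assumptions made for this example only; in the actual project, a trained CNN + LSTM model would supply the next-word scores from real image features.

```python
from typing import Callable, Dict, List

# Hypothetical start/end markers for a caption; not taken from the project.
START, END = "startseq", "endseq"


def generate_caption(
    image_features: List[float],
    next_word_scores: Callable[[List[float], List[str]], Dict[str, float]],
    max_length: int = 20,
) -> str:
    """Greedy decoding: repeatedly append the highest-scoring next word."""
    words = [START]
    for _ in range(max_length):
        scores = next_word_scores(image_features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    # Drop the start marker before returning the caption text.
    return " ".join(words[1:])


def toy_scores(image_features: List[float], words: List[str]) -> Dict[str, float]:
    """Stand-in for a trained LSTM decoder: replays one fixed sentence."""
    sentence = ["a", "dog", "runs", END]
    position = len(words) - 1  # number of words generated so far
    target = sentence[min(position, len(sentence) - 1)]
    return {w: (1.0 if w == target else 0.0) for w in sentence}


caption = generate_caption([0.1, 0.7, 0.2], toy_scores)
print(caption)  # prints: a dog runs
```

The decoder only needs a function that maps the image features and the words so far to a score per vocabulary word, so the stub can later be swapped for a real model without changing the loop.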

 

Keywords: Natural Language Processing (NLP), LSTM, CNN, Inception, Xception, Transfer Learning.

NOTE: Please do not submit this to the college without the consent of our team. The abstract varies based on student requirements.

Block Diagram

Specifications

HARDWARE SPECIFICATIONS:

  • Processor: Intel i3
  • RAM: 4 GB (min)
  • Hard Disk: 128 GB
  • Keyboard: Standard Windows Keyboard
  • Mouse: Two or Three Button Mouse
  • Monitor: Any

SOFTWARE SPECIFICATIONS:

  • Operating System: Windows 7+
  • Server-side Script: Python 3.6+
  • IDE: PyCharm
  • Libraries Used: Pandas, NumPy, scikit-learn, Flask, Seaborn, TensorFlow, Keras.

Learning Outcomes

  • Importance of classification.
  • Scope of the Xception model.
  • Use of CNN techniques.
  • Importance of PyCharm IDE.
  • Benefits of the Xception model.
  • Need for using a pre-trained model.
  • Process of debugging code.
  • Input and output modules.
  • How to test the project based on user inputs and observe the output.
  • Project Development Skills:
    • Problem analysis skills.
    • Problem-solving skills.
    • Creativity and imagination skills.
    • Programming skills.
    • Deployment.
    • Testing skills.
    • Debugging skills.
    • Project presentation skills.
    • Thesis writing skills.
