Advanced Detection of Violence from Video: Performance Evaluation of Transformer and State-of-the-Art Convolutional Neural Network Models

Project Code: TCMAPY1871

Objective

The objective of this project is to develop and evaluate deep learning models for real-time detection of violent incidents in video surveillance systems. By leveraging advanced models such as YOLOv8, MobileNetV3, and Transformer, the project aims to identify which model provides the highest accuracy in detecting violent actions within dynamic video environments. The models will be trained on publicly available video frames, with a focus on optimizing their parameters for improved performance. The goal is to enhance the efficiency and reliability of surveillance systems, enabling faster intervention and contributing to safer public spaces through accurate violence detection.
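As a rough illustration of how the models could be compared on accuracy, the sketch below computes accuracy, precision, and recall from binary frame labels (1 = violent, 0 = non-violent) and picks the model with the highest accuracy. The function names `binary_metrics` and `best_model`, the placeholder names `model_a` and `model_b`, and the toy labels are all assumptions made for this example; they are not the project's code or its results.

```python
from typing import Dict, List


def binary_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for labels 1 (violent) / 0 (non-violent)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # violent, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-violent, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed incident

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a model never predicts "violent"
        # or when the ground truth contains no violent frames.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def best_model(predictions: Dict[str, List[int]], y_true: List[int]) -> str:
    """Return the name of the model with the highest accuracy."""
    scores = {
        name: binary_metrics(y_true, y_pred)["accuracy"]
        for name, y_pred in predictions.items()
    }
    return max(scores, key=scores.get)


# Toy labels for illustration only; these are NOT results from the project.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
toy_predictions = {
    "model_a": [1, 0, 1, 0, 0, 0, 1, 0],
    "model_b": [1, 1, 0, 1, 0, 1, 1, 0],
}

for name, y_pred in toy_predictions.items():
    print(name, binary_metrics(y_true, y_pred))
print("highest accuracy:", best_model(toy_predictions, y_true))
```

Reporting precision and recall next to accuracy is a design choice for this sketch: in surveillance footage violent frames are usually rare, so accuracy alone can hide a model that misses most incidents.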

Abstract

The increasing need for safety in public and private spaces has led to the development of advanced surveillance systems capable of detecting violent incidents in real time. Because violent events can escalate quickly, it is critical for trained professionals to intervene at the earliest point of detection. Traditional surveillance systems rely on basic video monitoring, but advances in machine learning and deep learning have significantly enhanced the capabilities of these systems. This study evaluates three deep learning models (YOLOv8, MobileNetV3, and Transformer) on publicly available video frames for detecting violent actions. The models were trained with optimized parameters, and their performance was assessed on accuracy and detection capability. The Transformer model outperformed the other models, achieving the highest accuracy in identifying violent incidents. This suggests that Transformer-based models can provide superior performance for real-time violence detection in dynamic video environments. YOLOv8 and MobileNetV3, while effective, did not reach the same level of accuracy, which highlights the potential of Transformer models for future surveillance applications. These findings emphasize the importance of selecting the right model for effective violence detection and contribute to the ongoing evolution of AI-powered surveillance systems that can detect and respond to violent situations swiftly and accurately.

Keywords: Violence detection, video surveillance, deep learning, YOLOv8, MobileNetV3, Transformer, machine learning, Non-Violence detection.
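Since the models score individual video frames while an operator needs a single alert, one possible way to bridge the two is to smooth the per-frame scores over a short sliding window. The sketch below is a minimal illustration of that idea under stated assumptions: the synopsis does not describe how alerts are raised, and the names `smoothed_alerts` and `first_alert_frame`, the window size, and the threshold are all hypothetical.

```python
from collections import deque
from typing import Iterable, Iterator, Optional, Tuple


def smoothed_alerts(
    frame_scores: Iterable[float],
    window: int = 5,        # assumed number of recent frames to average
    threshold: float = 0.6, # assumed alert threshold on the average score
) -> Iterator[Tuple[int, float, bool]]:
    """Yield (frame_index, moving_average, alert) for a stream of per-frame scores.

    Each score is taken to be a model's estimated probability that the
    frame shows violence (an assumption for this sketch).
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    recent = deque(maxlen=window)  # keeps only the last `window` scores
    for index, score in enumerate(frame_scores):
        recent.append(score)
        average = sum(recent) / len(recent)
        # Alert only once the window is full, so a single early spike
        # cannot trigger an alarm by itself.
        alert = len(recent) == window and average >= threshold
        yield index, average, alert


def first_alert_frame(
    frame_scores: Iterable[float], window: int = 5, threshold: float = 0.6
) -> Optional[int]:
    """Return the index of the first frame that raises an alert, or None."""
    for index, _, alert in smoothed_alerts(frame_scores, window, threshold):
        if alert:
            return index
    return None


# Toy scores for illustration only: one isolated spike, then a sustained run.
toy_scores = [0.1, 0.9, 0.1, 0.2, 0.8, 0.9, 0.9]
print(first_alert_frame(toy_scores, window=3, threshold=0.6))
```

A larger window suppresses more false alarms from single noisy frames but delays the alert, so the window and threshold would need tuning against the latency the surveillance setting can tolerate.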

NOTE: Please do not submit this to the college without the consent of our team. This abstract varies based on student requirements.

Block Diagram

Specifications

Hardware Requirements

Processor         : Intel Core i3

Hard Disk         : 160 GB

Keyboard          : Standard Windows keyboard

Mouse             : Two- or three-button mouse

Monitor           : SVGA

RAM               : 8 GB

Software Requirements

Operating System       : Windows 7/8/10

Front-End Technologies : HTML, CSS, Bootstrap, and JavaScript

Programming Language   : Python

Libraries              : Django, Pandas, NumPy, TensorFlow, scikit-learn

IDE/Workbench          : VS Code

Technology             : Python 3.10

Database               : SQLite
