Violence detection on Surveillance using Convolutional Neural Networks


2025-06-28 16:29:56 - Adil Khan

Project Title

Violence detection on Surveillance using Convolutional Neural Networks

Project Area of Specialization

Artificial Intelligence

Project Summary

Violence is intolerable in any society. Surveillance cameras are typically installed at various spots within an area to detect and identify violence. The problem with a typical surveillance system is that someone must monitor it at all times. This invites human error, because a person cannot stay focused on one thing for long, and it demands substantial human resources. To resolve this issue, we propose a Violence Detection System for Surveillance using Convolutional Neural Networks (CNNs). Our system will automatically detect violence captured by any of the surveillance cameras. To implement it, we will use CNNs, a deep learning technique, together with some pretrained models. Violence can be detected using edge detection in the CNN: pixels that move quickly show higher intensity on the histogram, which indicates violence in the surroundings in real time. Previous work of this kind has used separate datasets, each for a single purpose. Unlike those approaches, we will not only detect violence in general but also distinguish pistol, knife and bare-hand fights using three classes. We will use the RWF-2000 dataset and other weapon detection datasets. We will develop a desktop application for all operations, including detecting violence and saving frames. After detecting violence, the system will generate an alert and display it in a pop-up box. We hope this project will contribute to the betterment of society.

Project Objectives

Our aim is to develop an end-to-end system that takes video data from a CCTV camera as input, detects violence (if any) in that data using a model trained with deep learning techniques, and notifies the concerned person when violence is detected.

1. Developing a Model

Our first objective is to develop a model which will be able to classify the input data into one of four classes:

  1. Violence NOT detected
  2. Violence detected without involving weapons.
  3. Violence detected involving knives.
  4. Violence detected involving guns.

To achieve this objective, we will have to do the following:

1. Obtaining an existing Model

2. Altering the Dataset used in existing model

3. Retraining the model

2. Developing Desktop Application

Our second objective is to develop a desktop application. We will collect real-time video data from a CCTV camera and send it to the host PC where our system is running. The system will take the form of a desktop application. It will process the video data and use our trained model to detect violence.

Project Implementation Method

1. Developing a Model

There are three steps involved in developing the required model for our project.

1.1.  Obtaining a Model

After searching the web, we found a suitable deep learning model for detecting violence. The model uses deep learning techniques such as 3D-CNN and optical flow. It is trained on the RWF-2000 video database, an open, large-scale video database for violence detection. The videos in that dataset do not involve weapons, and the existing model classifies videos into just two classes:

  1. Violence detected
  2. Violence NOT detected

For our system, we will reuse and extend that pretrained model to take advantage of its high accuracy. We will also have to alter the dataset by including videos that involve weapons.
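The existing model relies on motion cues such as optical flow. As a much simpler stand-in (frame differencing, not the optical flow that model uses), the sketch below illustrates the underlying idea that fast movement shows up as large intensity changes between consecutive frames. The function name `motion_score`, the threshold value, and the representation of frames as nested lists of grayscale values are all assumptions made for illustration; a real pipeline would more likely operate on image arrays.

```python
def motion_score(prev_frame, next_frame, threshold=30):
    """Return the fraction of pixels whose grayscale value changed by more
    than `threshold` between two frames of equal size.

    Frames are nested lists of integers in 0..255 (an assumption for this
    sketch). A higher score means more of the image is moving.
    """
    changed = 0
    total = 0
    for prev_row, next_row in zip(prev_frame, next_frame):
        for prev_pixel, next_pixel in zip(prev_row, next_row):
            total += 1
            if abs(next_pixel - prev_pixel) > threshold:
                changed += 1
    return changed / total if total else 0.0


# Two tiny 2x2 grayscale frames: only the top-left pixel changes strongly.
frame_a = [[10, 10], [10, 10]]
frame_b = [[200, 12], [10, 10]]
score = motion_score(frame_a, frame_b)
```

A score like this could at most serve as a cheap pre-filter; it cannot tell violent motion from any other fast motion, which is why the proposal relies on a trained model.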

1.2.   Altering the Dataset

The training part of the dataset contains 1600 videos (800 Violence and 800 Non-Violence), and the testing part contains 400 videos. We will replace 200 of the existing violence videos in the training part with our own videos involving weapons. Of those 200 videos, 100 will involve knives and the other 100 will involve guns.
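The arithmetic of this replacement can be checked with a short sketch. The class names and the helper `replace_videos` are hypothetical labels chosen for illustration; only the counts come from the description above.

```python
from collections import Counter

# Training split sizes as stated above; the existing violence videos
# contain no weapons, so they are labeled accordingly here.
train_counts = Counter({"non_violence": 800, "violence_no_weapon": 800})


def replace_videos(counts, source, replacements):
    """Return new per-class counts after swapping videos out of `source`
    for the newly labeled videos given in `replacements`."""
    removed = sum(replacements.values())
    if removed > counts[source]:
        raise ValueError("cannot replace more videos than the source class has")
    updated = Counter(counts)
    updated[source] -= removed
    updated.update(replacements)  # Counter.update adds to existing counts
    return updated


altered = replace_videos(
    train_counts,
    "violence_no_weapon",
    {"violence_knife": 100, "violence_gun": 100},
)
```

Under these numbers the training split stays at 1600 videos: 800 non-violence, 600 violence without weapons, 100 with knives and 100 with guns. The two weapon classes would be considerably smaller than the others, which may be worth keeping in mind during retraining.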

1.3.   Retraining the model

To make use of the existing model, we will retrain it on the altered dataset. As before, the model will classify videos into two classes: Violence detected and Violence NOT detected. We will then add further layers to the model that take the videos classified as violent as input and classify them into three classes: Violence without weapons, Violence with knives, and Violence with guns. For this purpose, we will use CNNs.
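The two-stage decision described above can be sketched as a small cascade. This is only an outline of the control flow: `is_violent` and `weapon_type` are hypothetical stand-ins for the retrained binary model and the added three-class stage, and the `Label` names are assumptions.

```python
from enum import IntEnum


class Label(IntEnum):
    """The four output classes of the overall system."""
    NO_VIOLENCE = 0
    VIOLENCE_NO_WEAPON = 1
    VIOLENCE_KNIFE = 2
    VIOLENCE_GUN = 3


def classify_clip(clip, is_violent, weapon_type):
    """Run the two-stage cascade on one clip.

    is_violent(clip) -> bool    stage 1: violence detected or not
    weapon_type(clip) -> Label  stage 2: only called for violent clips
    Both callables are placeholders for trained models.
    """
    if not is_violent(clip):
        return Label.NO_VIOLENCE
    return weapon_type(clip)


# Usage with stub classifiers standing in for the trained models.
result = classify_clip(
    "clip-001",
    is_violent=lambda clip: True,
    weapon_type=lambda clip: Label.VIOLENCE_KNIFE,
)
```

One consequence of this design is that the second stage never sees clips the first stage rejects, so a missed detection in stage 1 cannot be recovered later.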

2. Developing the Desktop Application

After developing the model, we will develop a desktop application. We will collect real-time video data from a CCTV camera and send it to the host PC where our system is running. The system will take the form of a desktop application. It will process the video data and use our trained model to detect violence.
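Because the model classifies videos rather than single frames, the application would need to group the incoming frames into clips before calling the model. A minimal sketch of such buffering follows; the class name `ClipBuffer` and the `clip_length` and `stride` parameters are assumptions, and the values used are illustrative rather than taken from the proposal.

```python
from collections import deque


class ClipBuffer:
    """Collect streamed frames and emit overlapping fixed-length clips."""

    def __init__(self, clip_length, stride):
        if clip_length < 1 or stride < 1:
            raise ValueError("clip_length and stride must be positive")
        self.clip_length = clip_length
        self.stride = stride
        # A bounded deque drops the oldest frame automatically.
        self._frames = deque(maxlen=clip_length)
        self._since_last_clip = 0

    def push(self, frame):
        """Add one frame; return a full clip as a list when one is due,
        otherwise return None."""
        self._frames.append(frame)
        self._since_last_clip += 1
        buffer_full = len(self._frames) == self.clip_length
        if buffer_full and self._since_last_clip >= self.stride:
            self._since_last_clip = 0
            return list(self._frames)
        return None


# Usage: integers stand in for frames read from the camera.
buffer = ClipBuffer(clip_length=4, stride=2)
clips = []
for frame in range(8):
    clip = buffer.push(frame)
    if clip is not None:
        clips.append(clip)
```

Overlapping clips (a stride smaller than the clip length) reduce the chance that a short violent event is split across two clip boundaries, at the cost of running the model more often.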

Benefits of the Project

Once our project is completed, it can help organizations detect violence on their premises. They will not need to hire a large number of security employees just to watch and monitor CCTV footage. Once the project is implemented, they would need just a few employees.

The system can detect where the violence has occurred, along with the severity of that violence. The people monitoring the live footage on PCs will be alerted via pop-up windows, and the concerned people will also be notified by text message on their cellphones. In this way, action can be taken in real time, and security guards can be sent to end the violence if needed.

Later, the organization can also check where the violence occurred, together with video data that includes only the violent part of the footage rather than long-running clips, which eliminates the need to fast-forward through the clip.

Technical Details of Final Deliverable

We will collect real-time video data from a CCTV camera and send it to the host PC where our system is running. The system will take the form of a desktop application. It will process the video data and use our trained model to detect violence.

  1. If no violence is detected in the data, the system will discard the data and move on to the next incoming data.
  2. If any violence is detected, the system will determine which of these types of violence it is:
    1. Violence without involving any weapon.
    2. Violence involving knives.
    3. Violence involving guns.

Once the nature of the violence is identified, the system will notify the concerned person by displaying a pop-up window on screen. An SMS alert will also be sent to the concerned person's cellphone. In addition, the specific part of the video data (frames) in which violent activity is detected will be saved for future reference.
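The handling steps above can be sketched as a single dispatch function. The label strings, the alert texts, and the three callbacks (`show_popup`, `send_sms`, `save_frames`) are hypothetical; the proposal does not specify how the pop-up, the SMS, or the storage would be implemented, so they are passed in as plain functions.

```python
# Hypothetical label names and alert texts, chosen for illustration.
ALERT_MESSAGES = {
    "violence_no_weapon": "Violence detected (no weapon)",
    "violence_knife": "Violence detected (knife)",
    "violence_gun": "Violence detected (gun)",
}


def handle_prediction(label, frames, show_popup, send_sms, save_frames):
    """Act on one classified clip.

    Returns False if the clip was discarded and True if an alert was
    raised and the frames were saved.
    """
    if label == "no_violence":
        return False  # discard; the caller moves on to the next data
    message = ALERT_MESSAGES[label]  # raises KeyError for an unknown label
    show_popup(message)
    send_sms(message)
    save_frames(label, frames)
    return True


# Usage with simple recorders standing in for the real pop-up, SMS and storage.
popups, texts, saved = [], [], []
alerted = handle_prediction(
    "violence_gun",
    ["frame-1", "frame-2"],
    show_popup=popups.append,
    send_sms=texts.append,
    save_frames=lambda label, frames: saved.append((label, frames)),
)
```

Passing the notification channels in as functions keeps the decision logic testable without a screen, an SMS gateway, or a disk.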

Final Deliverable of the Project: Software System
Core Industry: Security
Other Industries: IT
Core Technology: Artificial Intelligence (AI)
Other Technologies:
Sustainable Development Goals: Peace, Justice and Strong Institutions

Required Resources

Item Name                      Type       No. of Units  Per Unit Cost (in Rs)  Total (in Rs)
Wireless HD CCTV Camera        Equipment  1             5000                   5000
Google Colab Pro Subscription  Equipment  2             1600                   3200
Total (in Rs)                                                                  8200
