Adil Khan 11 months ago

Project Title

Alpha Vision Auto Object Tracker

Project Area of Specialization

Artificial Intelligence

Project Summary

Real-time object tracking is one of the most important topics in computer vision. Detecting and tracking moving objects in video scenes is the first step of information extraction in many computer vision applications. The idea can be used for surveillance, video annotation, traffic monitoring, human-computer interaction, intelligent transportation, robotics, and medical applications. In this project, we track objects using the OpenCV library together with the state-of-the-art DeepSORT tracker and the YOLOv4 detector. The proposed approach is demonstrated as a real-time multiple-object tracking system. The algorithm uses Kalman filters and linear assignment (the Hungarian algorithm) in DeepSORT, integrated with the YOLOv4 object detector, a single convolutional neural network pre-trained on the COCO dataset that can also be trained on custom objects.
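The assignment step mentioned in the summary can be illustrated with a minimal sketch. It is not the DeepSORT implementation: DeepSORT combines motion and appearance distances and solves the assignment with the Hungarian algorithm, whereas this sketch assumes a plain IoU cost and a brute-force search that is only practical for a handful of objects. The function names `iou` and `assign` and the `min_iou` threshold are hypothetical.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign(tracks, detections, min_iou=0.3):
    """Pair track boxes with detection boxes so that total (1 - IoU) is minimal.

    Brute force over permutations (an assumption made for brevity);
    pairs whose IoU is below min_iou are dropped as unmatched.
    """
    if not tracks or not detections:
        return []
    cost = [[1.0 - iou(t, d) for d in detections] for t in tracks]
    n_t, n_d = len(tracks), len(detections)
    if n_t <= n_d:
        candidates = ([(i, p[i]) for i in range(n_t)]
                      for p in permutations(range(n_d), n_t))
    else:
        candidates = ([(p[j], j) for j in range(n_d)]
                      for p in permutations(range(n_t), n_d))
    best = min(candidates, key=lambda pairs: sum(cost[i][j] for i, j in pairs))
    return [(i, j) for i, j in best if 1.0 - cost[i][j] >= min_iou]


# Two existing tracks, three new detections (the third is a new object).
tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
print(assign(tracks, detections))  # [(0, 1), (1, 0)]
```

Detections left unmatched, like the third one here, would normally start new tracks.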

Project Objectives

Our aim is to introduce recent advances in visual object tracking as well as motion detection including modeling of environments and shadow removal, different prediction methods, evaluation measures and datasets used for evaluation and comparison of object tracking methods.

We propose to develop a new visual tracking approach based on recurrent convolutional neural networks, which extends neural network learning and analysis into the spatial and temporal domains. The key motivation behind our method is that tracking failures can often be recovered effectively by learning from historical visual semantics and tracking proposals. In contrast to existing tracking methods based on Kalman filters or related temporal prediction methods, which consider only the location history, our recurrent convolutional model is "doubly deep" in that it examines the history of locations as well as the robust visual features of past frames.
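For reference, the kind of Kalman-filter prediction referred to above, which uses only the location history, can be sketched for a single coordinate. This is a minimal constant-velocity filter under stated assumptions: the class name `Kalman1D`, the noise values `q` and `r`, and the initial uncertainty `p` are illustrative choices, not values from the project.

```python
class Kalman1D:
    """Constant-velocity Kalman filter for one coordinate (e.g. box centre x)."""

    def __init__(self, pos, vel=0.0, p=100.0, q=0.01, r=1.0):
        self.pos, self.vel = pos, vel
        self.P = [[p, 0.0], [0.0, p]]  # state covariance
        self.q, self.r = q, r          # process and measurement noise (assumed)

    def predict(self, dt=1.0):
        """Advance the state by dt: pos += vel * dt, P = F P F^T + Q."""
        (p00, p01), (p10, p11) = self.P
        self.pos += self.vel * dt
        a = p00 + dt * p10
        b = p01 + dt * p11
        self.P = [[a + dt * b + self.q, b],
                  [p10 + dt * p11, p11 + self.q]]
        return self.pos

    def update(self, z):
        """Correct the state with a measured position z."""
        (p00, p01), (p10, p11) = self.P
        y = z - self.pos        # innovation
        s = p00 + self.r        # innovation variance
        k0, k1 = p00 / s, p10 / s
        self.pos += k0 * y
        self.vel += k1 * y
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]


# Object moving 2 pixels per frame; the filter learns the velocity
# from positions alone and can then predict the next location.
kf = Kalman1D(pos=0.0)
for frame in range(1, 31):
    kf.predict()
    kf.update(2.0 * frame)
next_position = kf.predict()
```

A tracker would typically run one such filter per box coordinate, or a single filter over the full box state.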

Secondly, this project aims to develop a device that uses computer vision, with the help of the OpenCV library, to automatically track an object in live video at high fps (frames per second) on an NVIDIA Jetson Nano. Research has been carried out in this field for many years; in this project, tracking will be done with the OpenCV library and the state-of-the-art DeepSORT tracker, YOLOv4 and Kalman filter algorithms. A camera will be mounted on a platform that can move along 2 axes (x and y), so that the camera follows the tracked object dynamically over a wide range. The platform will be driven by servo motors with feedback control to stabilize camera movement and reduce tracking error. Moreover, a 4-wheel-drive rover will move to follow the object under consideration while maintaining a defined distance from it.

A broad range of applications motivates interest in object tracking. Video surveillance is a very popular one. Surveillance systems are used not only to record the observed visual information but also to extract motion information and, more recently, to analyze suspicious behavior in the scene. One can visually track airplanes, vehicles, animals, micro-organisms or other moving objects, but detecting and tracking people is of particular interest. For instance, vision-based people-counting applications can provide important information for public transport, traffic congestion, tourism, retail, and security tasks. Tracking humans is also an important step in human-computer interaction (HCI). Video can also be processed to extract its story, to group similar frames into shots and shots into scenes, or to retrieve information of interest.

Project Implementation Method

In overview, we choose YOLO (You Only Look Once) to collect rich and robust visual features as well as preliminary location inferences. YOLO began as a research project at the University of Washington. With weights pre-trained on COCO it detects 80 everyday object classes, and it can also be trained on custom objects. Its processing time compares favourably with that of other object detection algorithms such as R-CNN and Fast R-CNN. The proposed model is a convolutional neural network that takes raw video frames as input and returns the coordinates of a bounding box for the tracked object in each frame.
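As an illustration of the bounding-box output described above, the sketch below converts a box from a normalized centre format to pixel corners. It assumes the Darknet-style convention of (centre x, centre y, width, height) expressed as fractions of the frame size; the actual output format depends on the framework or wrapper used, and the function name `yolo_to_pixels` is hypothetical.

```python
def yolo_to_pixels(box, frame_w, frame_h):
    """Convert (cx, cy, w, h), all in [0, 1], to pixel corners (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    x1 = (cx - w / 2) * frame_w
    y1 = (cy - h / 2) * frame_h
    x2 = (cx + w / 2) * frame_w
    y2 = (cy + h / 2) * frame_h

    def clamp(value, upper):
        # Keep the corner inside the frame and round to whole pixels.
        return int(round(max(0.0, min(float(upper), value))))

    return clamp(x1, frame_w), clamp(y1, frame_h), clamp(x2, frame_w), clamp(y2, frame_h)


# A box centred in a 640x480 frame, covering half of each dimension.
print(yolo_to_pixels((0.5, 0.5, 0.5, 0.5), 640, 480))  # (160, 120, 480, 360)
```

Pixel corners in this form are what OpenCV drawing functions such as `cv2.rectangle` expect.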

Conventional RNNs cannot access long-range context because the back-propagated error either blows up or decays over time, which is known as the exploding or vanishing gradient problem. By contrast, Long Short-Term Memory (LSTM) RNNs overcome this problem and are able to model self-learned context information. The major innovation of LSTM is its memory cell, which essentially acts as an accumulator of state information. The cell is accessed, written and cleared by several self-parameterized controlling gates. YOLO was introduced in 2015 by Joseph Redmon, Ali Farhadi and colleagues. It processes each frame in a single network pass, which gives it a much higher frame rate than region-based detectors such as Faster R-CNN while keeping competitive mAP scores. YOLO is also available with "tiny" weights, which deliver a higher FPS at the cost of accuracy, whereas the full weights deliver better accuracy at a lower FPS. The choice between them is a trade-off that can be made according to whether the specific application prioritizes FPS or accuracy.

A 4-wheel drive will be controlled by the algorithm: the output coordinates are scaled and fed to a feedback system that keeps the object in the centre of the frame, and the vehicle moves to maintain the specified distance between itself and the object. The vehicle will also avoid obstacles using an obstacle-avoidance routine coded into the microcontroller that runs the 4-wheel drive.
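The scaling of output coordinates into a centring error can be sketched as below. This is an assumption about how the scaling might be done, not the project's code: the error is normalized to the range -1 to 1 relative to the frame centre, and the names `centering_error` and `steering_command`, the gain and the deadband are all illustrative.

```python
def centering_error(box, frame_w, frame_h):
    """Offset of the box centre from the frame centre, scaled to [-1, 1] per axis."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2
    cy = (y1 + y2) / 2
    ex = (cx - frame_w / 2) / (frame_w / 2)
    ey = (cy - frame_h / 2) / (frame_h / 2)
    return ex, ey


def steering_command(error, gain=1.0, deadband=0.05):
    """Proportional command in [-1, 1]; small errors are ignored to avoid jitter."""
    if abs(error) < deadband:
        return 0.0
    return max(-1.0, min(1.0, gain * error))


# Object on the right-hand side of a 640x480 frame.
ex, ey = centering_error((480, 0, 640, 480), 640, 480)
turn = steering_command(ex, gain=2.0)  # saturates at full right turn
```

Normalizing the error keeps the controller gains independent of the camera resolution.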

Benefits of the Project

Computers are getting better every day at thinking, analyzing situations and making decisions as humans do. Understanding vision is an integral part of this progress in machine intelligence. One of the things that has baffled scientists and engineers is how to get algorithms to see not pixel by pixel but to capture the overarching patterns in a picture or a video. Object detection and object tracking technology have come far in that regard, and the boundaries are being challenged and pushed as we speak. While detecting objects in an image has received a lot of attention from the scientific community, a lesser-known area with widespread applications is tracking objects in a video, which requires merging our knowledge of detecting objects in static images with the analysis of temporal information in order to best predict trajectories.

Video cameras and digital media storage have become affordable. In recent years, we have seen a remarkable increase in the amount of video data recorded and stored around the world. To process all of this video data, there is a growing demand to analyze and understand video content automatically. One of the most fundamental processes in understanding video content is visual object tracking, which is the process of finding the location and dynamic configuration of one or more moving objects in each frame (image) of a video.

Image processing techniques, and their extension in the form of computer vision, are widely used in various applications. The most common example is the tracking of vehicles to monitor traffic and enforce road regulations. A very different example is the tracking of objects from drones and military aircraft. Object tracking can be defined as the process of segmenting an object of interest from a video scene and keeping track of its motion, orientation, occlusion, etc. in order to extract useful information. Real-time object detection and tracking is a critical task in many computer vision applications such as surveillance, driver assistance, gesture recognition and man-machine interfaces.

Technical Details of Final Deliverable

The vehicle is controlled by the microcontroller (Raspberry Pi), with one control system for obstacle avoidance and another for the movement of the vehicle. Movement is governed by a feedback system that minimizes the error so as to bring the bounding box to the centre of the video frame and maintain a set distance from the object along the z-axis, which is calculated from the ratio of the frame that the object occupies. The vehicle will follow the object being tracked (most likely a person), keeping it in the centre of the frame, much like camera work in film making or sports broadcasting.

A robust 4-wheel drive is designed for this application. The rover runs on high-torque stepper motors so that it can carry itself at the required speed. High-speed motors could be used for film making and broadcasting applications, but they are not considered in this project. A power bank will power the microcontroller and the driver IC for the stepper motors. Large rover wheels allow the vehicle to move on rough surfaces (sand, rocks, marble, etc.) as well as smooth ones. The camera is mounted on a pan-tilt mechanism that provides 2-axis movement, driven by the coordinates that the microcontroller receives from the tracking software running on the NVIDIA GPU.
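One way to turn the ratio of the object inside the frame into a distance estimate is sketched below. It assumes a pinhole-camera approximation in which the apparent height of the object is inversely proportional to its distance, calibrated with a single reference measurement. The function names and the reference values are hypothetical, not taken from the project.

```python
def height_ratio(box, frame_h):
    """Fraction of the frame height occupied by the box (x1, y1, x2, y2)."""
    return (box[3] - box[1]) / frame_h


def estimate_distance(ratio, ref_ratio, ref_distance_m):
    """Distance in metres, assuming apparent height is proportional to 1 / distance.

    ref_ratio is the height ratio measured once at the known ref_distance_m.
    """
    if ratio <= 0:
        raise ValueError("object not visible")
    return ref_distance_m * ref_ratio / ratio


def follow_speed(distance_m, target_m, gain=0.5, max_speed=1.0):
    """Forward (+) or reverse (-) speed command that closes the distance error."""
    return max(-max_speed, min(max_speed, gain * (distance_m - target_m)))


# Calibration (assumed): a person fills half the frame height at 2 m.
ratio = height_ratio((200, 180, 300, 300), frame_h=480)          # 0.25
distance = estimate_distance(ratio, ref_ratio=0.5, ref_distance_m=2.0)
speed = follow_speed(distance, target_m=2.0)
```

The estimate is only as good as the calibration and degrades when the object is partly outside the frame or changes pose.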

After receiving the tracked-object output from the GPU board (NVIDIA Jetson Nano), a microcontroller (Arduino) will output the speed of the tracked object along with the coordinates of the object being locked or tracked.
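A simple way to obtain the speed of the tracked object is from the displacement of its box centre between successive frames. The sketch below assumes speed is measured in pixels per second in the image plane; converting it to a physical speed would need the distance estimate and camera calibration. The names `box_center` and `pixel_speed` are hypothetical.

```python
import math


def box_center(box):
    """Centre (x, y) of a box given as (x1, y1, x2, y2)."""
    return (box[0] + box[2]) / 2, (box[1] + box[3]) / 2


def pixel_speed(prev_center, center, dt):
    """Image-plane speed in pixels per second between two centres dt seconds apart."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    dx = center[0] - prev_center[0]
    dy = center[1] - prev_center[1]
    return math.hypot(dx, dy) / dt


# Centre moved 3 px right and 4 px down in half a second.
print(pixel_speed((0, 0), (3, 4), 0.5))  # 10.0
```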

It will then send the speed of the object as an input to a feedback control system that controls the speed of the stepper motors in a closed loop. A PID controller will be placed before the motors in the forward path and will control the motor speed according to the speed of the locked object across successive frames. Driving the stepper motors in real time will keep the RC car at the proper speed as it follows the object. The body of the hardware will consist of two layers. The lower layer will be a 6 by 8 inch rectangle with large rover wheels on its sides and will hold the Raspberry Pi, the driver IC and the motors that drive the vehicle. The upper layer, with approximately the same dimensions, will hold the NVIDIA Jetson Nano, the pan-tilt mechanism on which the camera is mounted, and the two batteries that power the two microcontrollers. The coordinates of the tracked object are transmitted to a second feedback control system that controls the position of the servo motors in a closed loop. As before, a PID controller will be placed before the servo motors, which sit in the forward path with the pan-tilt camera assembly, to control their position. This closed loop will track the object continuously, adjusting the pan and tilt angles according to the object's x-y coordinates in the frame.
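The PID loops described above can be sketched with a small discrete controller. This is a generic textbook form, not the project's firmware: the gains are untuned illustrative values, the output is clamped to -1 to 1, and a simple anti-windup rule (an added assumption) stops the integral from growing while the output is saturated. The toy plant in `simulate_pan` assumes a pan servo that slews at up to 60 degrees per second.

```python
class PID:
    """Discrete PID controller with output clamping and basic anti-windup."""

    def __init__(self, kp, ki, kd, out_min=-1.0, out_max=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        integral = self.integral + error * dt
        raw = self.kp * error + self.ki * integral + self.kd * derivative
        out = max(self.out_min, min(self.out_max, raw))
        if out == raw:
            # Anti-windup: accumulate the integral only when not saturated.
            self.integral = integral
        return out


def simulate_pan(target_deg, steps=200, dt=0.02, max_rate_deg_s=60.0):
    """Toy closed loop: the command sets the pan rate, the angle integrates it."""
    pid = PID(kp=0.2, ki=0.05, kd=0.0)  # illustrative gains, not tuned values
    angle = 0.0
    for _ in range(steps):
        command = pid.update(target_deg - angle, dt)
        angle += command * max_rate_deg_s * dt
    return angle


final_angle = simulate_pan(30.0)  # settles close to the 30 degree target
```

The same class could serve both loops, with the error being the speed difference for the drive motors and the pixel offset for the pan-tilt servos.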

Final Deliverable of the Project

HW/SW integrated system

Core Industry

Security

Other Industries

Media

Core Technology

Artificial Intelligence (AI)

Other Technologies

Robotics

Sustainable Development Goals

Industry, Innovation and Infrastructure

Required Resources

Item Name	Type	No. of Units	Per Unit Cost (in Rs)	Total (in Rs)
NVIDIA Jetson Nano	Equipment	1	24000	24000
Raspberry Pi 4	Equipment	1	8000	8000
Servo Motors MG-996	Equipment	2	800	1600
Camera Eken H9R	Equipment	1	10000	10000
2S Lipo Battery 10000mAh	Equipment	1	5000	5000
Chassis with Motors	Equipment	1	9500	9500
Driver IC	Equipment	1	2500	2500
Ultrasonic Sensor HC-SR04	Equipment	1	200	200
DC-DC Booster	Equipment	1	3000	3000
Wires, sheets, power supply	Miscellaneous	1	3000	3000
Transportation	Miscellaneous	1	2000	2000
			Total (in Rs)	68800
