Shopping Assistant For Visually Impaired Community


2025-06-28 16:29:04 - Adil Khan

Project Title

Shopping Assistant For Visually Impaired Community

Project Area of Specialization

Artificial Intelligence

Project Summary

Millions of people worldwide struggle to understand their surroundings because of visual impairment. Although they develop alternative strategies for daily routines, they still face navigation difficulties as well as social awkwardness; for example, it is very hard for them to find a particular item in an unfamiliar environment, so they must depend on others to meet the basic needs of their lives.

Shopping Assistant for Visually Impaired Community aims to help visually impaired people detect and identify objects when they are in a new place. The system is a mobile application that captures the scene with the phone camera in real time, detects the objects and obstacles in front of the camera, and uses a conversational agent to inform the user about those objects and their details, enabling the user to pick the required items by voice while shopping or even at home.

The pipeline works as follows. The system captures real-time video with the mobile camera. It then performs image recognition with deep learning algorithms such as YOLO to detect and identify objects and to estimate each object's distance and position. Text on an object is read using OCR methods. Finally, the objects and their details are reported to the visually impaired person by voice.
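The per-frame flow described above (detect an object, estimate its position, announce it by voice) can be sketched as follows. The function names, the thirds-of-the-frame position heuristic, and the sample detection values are illustrative assumptions, not the project's actual code.

```python
# Sketch of the per-frame pipeline: detections in, spoken description out.
# The detection fields mimic typical YOLO output: (label, confidence, x, y, w, h)
# in pixel coordinates. The position heuristic is a placeholder assumption.

def horizontal_position(x, w, frame_width):
    """Describe where the object sits relative to the camera centre."""
    centre = x + w / 2
    if centre < frame_width / 3:
        return "on your left"
    if centre > 2 * frame_width / 3:
        return "on your right"
    return "ahead of you"

def describe_detection(label, confidence, x, y, w, h, frame_width=640):
    """Turn one detection into a sentence suitable for text-to-speech."""
    position = horizontal_position(x, w, frame_width)
    return f"{label} {position}"

# Example: a bottle detected near the right edge of a 640-pixel-wide frame.
sentence = describe_detection("bottle", 0.91, x=500, y=120, w=80, h=200)
print(sentence)  # bottle on your right
```

In a real build, the sentence would then be handed to the text-to-speech component rather than printed.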

Project Objectives

Millions of people worldwide struggle to understand their surroundings because of visual impairment. They must depend on others for daily tasks while also suffering from social awkwardness. This project aims to help visually impaired people detect and identify objects when they are in a new place. The system is a mobile application that captures the scene with the phone camera, detects the objects and obstacles in front of the camera, and recognizes the text on each object. It then uses speech guidance to inform the user about the objects and their details, enabling them to pick required items by voice while shopping or even at home.
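Raw OCR output from a product label is usually noisy (stray symbols, broken line breaks), so the recognized text typically needs light cleanup before it can be read aloud. A minimal sketch, assuming the OCR step returns a plain string; the specific cleanup rules are illustrative assumptions:

```python
import re

def clean_ocr_text(raw: str) -> str:
    """Normalize raw OCR output for speech: join lines, drop symbol noise."""
    text = " ".join(raw.splitlines())                # collapse line breaks
    text = re.sub(r"[^A-Za-z0-9.,%$ ]", " ", text)   # drop stray symbols
    return re.sub(r"\s+", " ", text).strip()         # squeeze whitespace

# Example: a label scanned across three lines with OCR artifacts.
print(clean_ocr_text("Whole\nMilk |~ 2%\nFat"))  # Whole Milk 2% Fat
```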

Project Implementation Method

The design and implementation of the app are divided into the following steps:

Benefits of the Project

The system should help visually impaired people navigate shopping activities by accurately detecting objects and the text on them, and should then assist the user by describing each object by voice in a real-time environment.

The concept of using machine-learning methods to assist the visually impaired community is not new; several studies have been carried out in this field comparing the accuracy of different methods. This app aims to provide accurate results within seconds, which distinguishes it from traditional software.

Technical Details of Final Deliverable

Hardware Interfaces

Software Interfaces

The backend is built in the Java programming language, and MySQL/Firebase is used for storing data in the database. Objects detected with the YOLO algorithm are converted to speech using gTTS (Google Text-to-Speech), a Python library and command-line interface tool that interfaces with Google Translate's text-to-speech API. The generated text is converted into speech to assist visually impaired people through the B-learner (mobile app).
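As an illustration of the speech step, a small helper could compose the announcement and hand it to gTTS when the library is available. The function names and the print fallback are assumptions for this sketch; gTTS itself requires the `gtts` package and network access to Google's service.

```python
def build_announcement(label: str, details: str = "") -> str:
    """Compose the sentence the assistant will speak for a detected object."""
    return f"Detected {label}. {details}".strip()

def speak(text: str, out_path: str = "announcement.mp3") -> None:
    """Convert text to speech with gTTS if installed; else print as fallback."""
    try:
        from gtts import gTTS  # needs the gtts package and network access
        gTTS(text=text, lang="en").save(out_path)
    except ImportError:
        print(text)

# Usage (writes announcement.mp3 when gTTS is available):
# speak(build_announcement("cereal box", "Price tag reads 250 rupees."))
```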

Communications Interfaces

The system runs at the application layer: users simply install the app and run it. The network layer is used for communication with the end user. The application is under full control of the server; requests such as object detection, text recognition, and adding new objects are sent to the server when a user uses the application.
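Since camera frames travel from the app to the server for detection, a JSON body carrying a base64-encoded image is one plausible wire format. A hedged sketch using only the standard library; the field names are assumptions, not the project's actual protocol:

```python
import base64
import json

def encode_frame_request(frame_bytes: bytes, user_id: str) -> str:
    """Client side: pack a raw camera frame into a JSON request body."""
    return json.dumps({
        "user_id": user_id,
        "frame_b64": base64.b64encode(frame_bytes).decode("ascii"),
    })

def decode_frame_request(body: str) -> bytes:
    """Server side: recover the raw frame bytes from the request body."""
    return base64.b64decode(json.loads(body)["frame_b64"])

# Round trip: the server receives exactly the bytes the client captured.
payload = encode_frame_request(b"\x89PNG...", "user42")
assert decode_frame_request(payload) == b"\x89PNG..."
```

Base64 keeps the binary frame safe inside JSON at the cost of roughly a third more bandwidth; a production app might instead post the frame as a multipart upload.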

Final Deliverable of the Project: Software System
Core Industry: Medical
Other Industries:
Core Technology: Artificial Intelligence (AI)
Other Technologies:
Sustainable Development Goals: Good Health and Well-Being for People

Required Resources
Item Name    Type           No. of Units    Per Unit Cost (in Rs)    Total (in Rs)
Printing     Miscellaneous  1               1000                     1000
Total (in Rs)                                                        1000
