Urdu Natural Language Processing
2025-06-28 16:36:31 - Adil Khan
Project Area of Specialization: Artificial Intelligence
Project Summary
Natural Language Processing (NLP) is one of the many application areas of Deep Learning, which is itself derived from the broader field of Machine Learning and is one of the fastest-emerging technologies of this era. Many NLP systems are already in use today; well-known examples include Cortana on Windows, Amazon's Alexa, and Google Duplex. These intelligent assistants listen to the user, carry out the requested task accurately, and respond. We applied this prior knowledge to our national language, Urdu. This was our first step toward developing Artificial Intelligence in Urdu: a system that interacts with the user in the Urdu language. We used Deep Learning techniques, including sequence-to-sequence models and TensorFlow. The system takes typed Urdu text as input, interprets it, and responds according to what it has learned. It can report the time, date, and weather (the weather is obtained by integrating a weather API, or Application Program Interface), entertain the user with jokes and poetry, and of course carry on a conversation with the user. All steps, modules, and procedures are explained in their respective chapters.
Project Objectives
The main aim was to take the very first step in the field of Artificial Intelligence for the Urdu language: to develop the first AI system implemented in our national language, one that interacts with people in Urdu.
When we began our research and started planning the project, we learned that three modules are required to build a system that can interact with its audience by voice, as Cortana, Siri, and Alexa do:
- Voice to Text Module
- Text to Text Module
- Text to Voice Module
With these modules in mind, and given the time available to complete the project, we focused on developing the core module, the Text to Text Module, leaving a path for others to continue our work and extend it with the voice modules.
- The main objective was to develop the first Artificial Intelligence system in Urdu using deep neural networks and Natural Language Processing.
- The work focuses on the text-to-text module only: all queries, tasks, and chat messages must be typed in Urdu on a keyboard, and the system displays its response as Urdu text on the screen.
- Every interaction requires typing the question into the system.
- The system can report the date, time, and weather for more than 100 well-known cities around the globe.
- It can also entertain the user with poetry and jokes, answer commonly asked questions, and carry on an ordinary conversation.
In its current form the system is not useful to people who are unfamiliar with computers or who cannot read or write, but it is an initiative toward further development. Having laid the foundation with a text-only AI system in Urdu, the work can be continued and extended with voice so that it achieves its prime objective.
Project Implementation Method
We divided the project into small modules, each dependent on the others. In Deep Learning, a system needs a data set to train on and learn from; in Natural Language Processing that data set is known as a corpus. The plan of attack was as follows:
- First, create the corpus from which the system learns. What the corpus is, what it contains, and why it is the most essential part of the project are explained in detail in the respective chapter.
- Next, preprocess the corpus to convert the Urdu text into a numeric representation, since a computer works only with numbers.
- After preprocessing, develop and tune the neural network model using LSTMs, which is responsible for the intelligence of the system.
- Then integrate the weather API with the neural network so that the system can also answer weather queries.
- Convert the generated output of the system back into Urdu text so that a human can understand the response.
- Once the system is ready, develop a platform for it to run on, so that it can be installed on any computer and executed like other programs.
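The preprocessing and output-conversion steps in the plan above can be illustrated with a minimal sketch. It assumes simple whitespace tokenization and a word-level vocabulary; the helper names (`build_vocab`, `encode`, `decode`), the special tokens, and the two sample sentences are hypothetical and are not taken from the project's code.

```python
# Illustrative only: the project's actual preprocessing is not described here.
PAD, UNK = "<pad>", "<unk>"  # assumed special tokens


def build_vocab(sentences):
    """Assign an integer id to every whitespace-separated token in the corpus."""
    vocab = {PAD: 0, UNK: 1}
    for sentence in sentences:
        for token in sentence.split():
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def encode(sentence, vocab, max_len):
    """Convert text to a fixed-length list of ids (truncate, then pad)."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in sentence.split()][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


def decode(ids, vocab):
    """Convert ids back to text, dropping padding."""
    inverse = {i: tok for tok, i in vocab.items()}
    return " ".join(inverse[i] for i in ids if i != vocab[PAD])


# Hypothetical two-sentence corpus ("What is your name", "How is the weather today").
corpus = ["آپ کا نام کیا ہے", "آج موسم کیسا ہے"]
vocab = build_vocab(corpus)
encoded = encode(corpus[1], vocab, max_len=6)
print(encoded)
print(decode(encoded, vocab))
```

The fixed-length id sequences are the kind of input an LSTM-based model consumes, and `decode` is the inverse mapping used to turn the model's output ids back into readable Urdu text.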
Being text-only is one of the system's limitations, but once it is combined with a voice module it could be used in many ways:
- As a personal assistant.
- As an FAQ service on company websites.
- As a telephone operator for complaint registration, or as a service operator for companies such as PTCL and SNGPL.
- As a booking agent.
Our project can report the weather, time, and date for more than 200 cities in well-known countries around the globe. Beyond this, the project offers the following services:
- It can report the weather, time, and date for different cities in different countries.
- It can entertain you with jokes.
- It can also entertain you with poetry.
- And of course it can carry on general conversations with its users.
The first task is achieved with the help of the weather API; the others are achieved with simple programming. For example, if the trained neural network detects a question related to jokes or poetry, the system picks a random joke or poem from the respective database and displays it on the screen.
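The routing described above might look roughly like the following sketch. It assumes the neural network has already produced an intent label; the labels, the `respond` and `fetch_weather` functions, and the placeholder joke and poetry lists are all hypothetical, and the weather lookup is a stub because the specific weather API is not named here.

```python
import random
from datetime import datetime

# Hypothetical stand-ins for the joke and poetry databases.
JOKES = ["لطیفہ ۱", "لطیفہ ۲"]
POEMS = ["شعر ۱", "شعر ۲"]


def fetch_weather(city):
    """Stub for the weather API call (the real API is an unknown here)."""
    return f"{city}: weather lookup not implemented in this sketch"


def respond(intent, city=None):
    """Route a detected intent to a simple handler.

    Returns None when no handler matches, so the caller can fall back
    to the reply generated by the neural network.
    """
    if intent == "joke":
        return random.choice(JOKES)
    if intent == "poetry":
        return random.choice(POEMS)
    if intent == "weather":
        return fetch_weather(city)
    if intent == "time":
        return datetime.now().strftime("%H:%M")
    if intent == "date":
        return datetime.now().strftime("%Y-%m-%d")
    return None


print(respond("joke"))
print(respond("weather", city="Lahore"))
```

Keeping the canned handlers separate from the model means jokes, poems, and weather answers can be updated without retraining the network.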
Final Deliverable of the Project: Software System
Type of Industry: IT
Technologies: Artificial Intelligence (AI)
Sustainable Development Goals: Industry, Innovation and Infrastructure
Required Resources

| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Jetson TX2 Nvidia AI kit | Equipment | 1 | 67000 | 67000 |
| Urdu Drama Transcripts Printing | Miscellaneous | 10 | 800 | 8000 |
| Total (in Rs) | | | | 75000 |