Urdu Fake News Detection


2025-06-28 16:29:54 - Adil Khan

Project Title

Urdu Fake News Detection

Project Area of Specialization

Artificial Intelligence

Project Summary

Fake news is a common term in today’s world, and it spreads panic among people. In the modern era, where the internet is ever-present, everyone relies on various online resources for news. With the growing use of social media platforms such as Facebook and Twitter, news spreads rapidly among millions of users within a very short span of time. The spread of fake news has far-reaching consequences, from creating biased opinions to swaying election outcomes for the benefit of certain candidates. Moreover, spammers use appealing news headlines as clickbait to generate advertising revenue.

This project aims to classify news articles available online with the help of a labeled news data set. Classification is preceded by preprocessing of the data: removing less important words, removing useless characters, and stemming. Artificial Intelligence, Natural Language Processing, and Machine Learning techniques are applied to classify news. The fake news detection system distinguishes fake from real news using the classification algorithms MLNB (Multinomial Naive Bayes), PAC (Passive Aggressive Classifier), and SVM (Support Vector Machine). The system also scrapes the latest news from authentic news publishing websites and appends it to the data set, which is updated daily.
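The preprocessing and classification steps described above can be sketched roughly as follows. This is a minimal illustration only, assuming MLNB denotes multinomial Naive Bayes; it uses the Python standard library, omits stemming, and the stop-word list, class name, and training sentences are all hypothetical placeholders rather than the project's actual data or code.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real Urdu system would need its own list.
STOP_WORDS = {"the", "a", "an", "is", "in", "of", "to", "and"}


def preprocess(text):
    """Lowercase, keep only runs of letters, and drop stop words."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


class MultinomialNB:
    """Tiny multinomial Naive Bayes over bag-of-words counts."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(preprocess(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        tokens = preprocess(doc)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus log likelihood with Laplace (add-one) smoothing.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy labeled data, invented purely for illustration.
train_docs = [
    "government announces new budget for schools",
    "central bank publishes quarterly inflation report",
    "shocking miracle cure doctors hate revealed",
    "celebrity secretly replaced by clone shocking proof",
]
train_labels = ["real", "real", "fake", "fake"]

model = MultinomialNB().fit(train_docs, train_labels)
print(model.predict("shocking miracle proof revealed"))
print(model.predict("bank publishes budget report"))
```

A production system would more likely rely on an established machine learning library and a much larger labeled corpus; the sketch only shows how cleaned tokens feed a bag-of-words classifier.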

Live searching is a module that uses scraping and string comparison techniques to output the top three related news items from international sources. The system uses the ‘Google News’ API for this purpose: the claim is passed to the API, which processes it and returns all the articles that match the claim. A threshold is then applied to cite the top three matched articles. Since ‘Google News’ is usually reliable, this method is more accurate. It uses natural language processing techniques to compare the claim with the results produced by ‘Google News’.
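One way to picture the thresholded top-three selection is the sketch below. It assumes the headlines have already been retrieved (no API call is made here), and it uses the standard-library `difflib.SequenceMatcher` as a stand-in for whatever string comparison the project actually applies. The function name `top_matches`, the threshold value, and the sample headlines are hypothetical.

```python
from difflib import SequenceMatcher


def top_matches(claim, headlines, threshold=0.5, k=3):
    """Return up to k (score, headline) pairs scoring at or above threshold."""
    scored = [
        (SequenceMatcher(None, claim.lower(), h.lower()).ratio(), h)
        for h in headlines
    ]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:k]


# Hypothetical headlines standing in for search results.
results = [
    "Prime minister announces new education policy",
    "Prime minister unveils education policy for schools",
    "Cricket team wins the series final",
    "New education policy announced by the prime minister",
]

for score, headline in top_matches("prime minister announces new education policy", results):
    print(f"{score:.2f}  {headline}")
```

The threshold keeps weak matches out of the cited list even when fewer than three strong matches exist, so the function may return fewer than three items.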

Another technique used to detect fake news is stance detection, which produces the stance of a given news body toward its headline. A neural network is used to train the stance detection model. The data set contains many news bodies for a single headline, each labeled with the relationship between the two, e.g. related, unrelated, or discussed. So, when a user gives input, the system can identify whether the news headline and body are related, unrelated, or discussed.

Project Objectives

The objective of this project is to comprehensively summarize, compare, review, and evaluate the current research on fake news. Fake news authentication software is very useful today. Our project aims to classify news as fake or true with the help of techniques such as similarity matching and machine learning. Back in the '90s, when news was published mainly on paper, determining whether a piece of news was fake or real was very difficult. Now machine learning provides solutions for tackling fake news with the help of classification algorithms and a bag-of-words model.

Fake news is now very easily available on social media. News is important for every country, and fake news plays a major role in manipulating public opinion. News can affect all aspects of life, such as education, technology, and politics, which is why the list of benefits is long. Nowadays fake news goes viral on social media, and people start talking about it without checking its authenticity, so a false narrative is built in society. The authenticity of news matters because if fake news about national security goes viral on social media, people panic.

Project Implementation Method

Our news authentication system uses machine learning and NLP to verify whether a piece of news is real or fake. It gets the news from social media (Facebook) and matches it against authentic news agencies. The sentiment of a specific news item is also produced using the Python library TextBlob. Our system gives recommendations with the help of our dataset. The process is as follows: scrapers collect data from authentic news sources and append it to our dataset. We then use cosine similarity to find the news most similar to the searched query. Another technique we use is to fetch data for the searched query through the “Google News” API; we store that data in our dataset and then apply cosine similarity to get the top three highest-matched results. The results of each technique are displayed below.
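The cosine similarity matching step can be illustrated with a short sketch. It assumes a plain bag-of-words representation with raw term counts, which the project description does not specify; the function names and the sample dataset are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words vector: token -> raw count."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_three(query, articles):
    """Rank articles by similarity to the query and keep the best three."""
    q = vectorize(query)
    ranked = sorted(
        articles,
        key=lambda art: cosine_similarity(q, vectorize(art)),
        reverse=True,
    )
    return ranked[:3]


# Hypothetical dataset entries standing in for scraped news.
dataset = [
    "prime minister announces new education policy",
    "cricket team wins the series final",
    "new education policy announced by prime minister today",
    "heavy rain expected in karachi this week",
    "stock market closes higher",
]

for article in top_three("prime minister education policy", dataset):
    print(article)
```

Raw counts are the simplest choice; weighting terms (for example with TF-IDF) would reduce the influence of very common words, but that is a design decision beyond what the description above states.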

As people lean toward social media more than conventional media, the risk of fake news spreading is higher. We investigated the fake news problem by reviewing the existing literature in two stages: characterization and detection. In the characterization stage, we introduced the basic concepts and principles of fake news in both traditional media and social media. The growing problem of fake news complicates matters and tends to change or hamper people's opinions and attitudes toward the use of digital technology. When a person is deceived by news that appears genuine, two things can happen. First, people begin to believe that their perceptions about a particular topic are true as assumed. Second, even when a news story is available that contradicts an apparently fake one, people believe the words that support their own view without taking the facts into account.

While developing this project we faced many difficulties and limitations. A dataset of fake and real news for Pakistan is not available on the internet, although datasets are available for international news. We also need to update our dataset frequently, because the sources keep publishing new stories and the scrapers must fetch all of them and append them to the dataset, so the scraper scripts need to run very frequently. Our future work includes a mobile app, a browser plugin, and tackling social media platforms. This work is difficult and requires more time and skill. We discuss the future work in detail below.

Benefits of the Project

The application area for this project is social media users who are active all the time, sharing posts and news; with this project those users can verify a given news item. A user can enter the news to be verified, or the link to a blog or news site that seems doubtful. Our algorithm judges the news based on stance detection and the dataset on which it was trained, and after processing it outputs, for example, how likely the news is to be true. The project is important for tackling the rapid spread of fake news. There is currently no such system in Pakistan that can tackle the spread of fake news. PTA is planning to launch a ‘Social Media Monitoring Cell’ to monitor certain pages that produce content against the national security of Pakistan.

This project was selected for development because it uses NLP (Natural Language Processing) and machine learning algorithms. Machine learning is everywhere today, and everything is becoming automated or intelligent with the help of AI. Students who want to secure a career in the software industry must therefore learn machine learning tools, algorithms, and techniques. This project will help us understand the fundamentals of machine learning, e.g. supervised learning, unsupervised learning, and their algorithms. After completing this project we will be able to work on any machine learning project, and in particular we will have a grip on NLP and its algorithms.

Technical Details of Final Deliverable

The experiments were conducted on a workstation PC equipped with a 4-core Intel Xeon processor clocked at 4 GHz, an NVIDIA Quadro P4000 GPU, 16 GB of memory, and a 1 TB hard disk with a 500 GB graphics card. Python 3.9, the latest version, was used.

Final Deliverable of the Project

Software System

Core Industry

Media

Other Industries

IT

Core Technology

Artificial Intelligence (AI)

Other Technologies

Sustainable Development Goals

Peace and Justice Strong Institutions

Required Resources

Item Name                   Type             No. of Units    Per Unit Cost (in Rs)    Total (in Rs)
Intel Xeon Processor        Equipment        1               60000                    60000
Working model of project    Miscellaneous    3               1000                     3000
Total (in Rs)                                                                         63000
