PSAT Preprocessing and Sentimental Analysis Tool
2025-06-28 16:28:52 - Adil Khan
Project Area of Specialization: Computer Science

Project Summary
In this project, we have developed a preprocessing and sentiment analysis (SA) tool, PSAT, to analyze the sentiments of the general public, cricket audiences, former cricketers, and other sports personalities regarding New Zealand's abandonment of its tour of Pakistan, which was called off citing security concerns. This paper focuses on cleaning textual data and analyzing the sentiments it expresses. The data is collected from social media sites, namely Facebook and Twitter. The user can choose a topic and specify their preferences. The model uses recent related tweets to detect the polarity of the issue (negative, positive, both, or neutral) and displays the findings. Around 3000 Arabic tweets were randomly selected and evenly labelled to train the program. In this research, we offer a novel technique that uses a combination of parameters to apply sentiment analysis to cricket-related tweets and comments. Those parameters are the time of the tweets and preprocessing options such as stemming, retweets, whitespace removal, and capitalization. PSAT is combined with the Naive Bayes classifier, a family of classification algorithms based on Bayes’ theorem. The accuracy of PSAT is approximately 75%, and its F1 score is 69%. In our experiments, the Naive Bayes machine learning approach was the most accurate at predicting topic polarity. The tool is well suited to intermediate and advanced users, and it can assist them in determining the ideal parameter combinations for sentiment analysis.

Project Objectives
The objective of PSAT (Preprocessing and Sentiment Analysis Tool) is first to preprocess, or clean, the scraped data containing the sentiments of cricket analysts, former cricketers, other sportspeople, and the audience, and then to classify those sentiments as positive, negative, or neutral using Naïve Bayes. The tool is developed in C#.

Project Implementation Method
This study evaluates content related to the impact of the cricket tour's cancellation in order to obtain user sentiments from social media. Figure 1 shows the flow and significant steps of the proposed method. The study also trains a Naive Bayes machine learning classifier to categorize content by user sentiment regarding the recent cancellation of New Zealand's 2021 One Day International tour of Pakistan. The aim of the research is therefore to capture what people across the globe think about the cancellation of this tour. After collecting the data with a scraping tool, we developed the data cleaning tool PSAT using C# and .NET. It cleans and preprocesses the data and reports the results as positive, negative, and neutral views, in the form of a graph and the resulting data. PSAT takes an Excel file as input and loads it, then offers data cleaning options such as removing stop words, removing whitespace, capitalizing the first letter, and more. Using these options, the user can preprocess the data as needed. The data is then processed further to generate graphs showing how the sentiments divide into neutral, positive, and negative.
- Whitespace removal
- Uppercase to lowercase conversion
- Stop word removal
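
The three cleaning options listed above can be sketched as follows. PSAT itself is written in C#, so this Python version is only an illustration of the idea, not the tool's code. The function name `preprocess`, its parameters, and the small `STOP_WORDS` set are all hypothetical; the source does not give PSAT's actual stop-word list.

```python
import re

# Hypothetical stop-word list for illustration only;
# the list PSAT actually uses is not given in the source.
STOP_WORDS = {"a", "an", "the", "is", "was", "of", "to", "and", "on", "by"}


def preprocess(text, lowercase=True, remove_stop_words=True):
    """Apply the selected cleaning options to one tweet or comment."""
    # Whitespace removal: collapse runs of spaces, tabs and newlines, then trim.
    text = re.sub(r"\s+", " ", text).strip()

    # Uppercase to lowercase conversion.
    if lowercase:
        text = text.lower()

    tokens = text.split(" ") if text else []

    # Stop word removal (compared case-insensitively).
    if remove_stop_words:
        tokens = [t for t in tokens if t.lower() not in STOP_WORDS]

    return " ".join(tokens)


cleaned = preprocess("  The Tour   was ABANDONED by New Zealand ")
```

Each option is a separate flag so that, as in the tool's description, a user can switch individual cleaning steps on or off before the sentiment step runs.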
There are two sets of words. One is entirely positive, while the other is both positive and negative. The algorithm scans the text for terms that match these word sets. It then determines which type of word occurs more often in the text. The text is assigned positive polarity if it contains more positive words. The problem with rule-based algorithms is that, although they produce some results, they lack the flexibility and accuracy needed to be genuinely useful.
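
A minimal sketch of this rule-based counting approach is shown below, in Python for illustration only. It makes simplifying assumptions that are not taken from the source: it uses one positive and one negative word list, both lists are invented examples, and a tie is reported as neutral. The function name `rule_based_polarity` is hypothetical.

```python
import re

# Hypothetical word lists, invented for this example.
POSITIVE_WORDS = {"good", "great", "support", "love", "safe"}
NEGATIVE_WORDS = {"bad", "sad", "unfair", "disappointed", "shame"}


def rule_based_polarity(text):
    """Label text by counting matches against the two word lists."""
    # Lowercase and keep only alphabetic tokens so punctuation is ignored.
    tokens = re.findall(r"[a-z']+", text.lower())
    positive = sum(1 for t in tokens if t in POSITIVE_WORDS)
    negative = sum(1 for t in tokens if t in NEGATIVE_WORDS)

    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"  # assumption: equal counts are treated as neutral
```

With these lists, `rule_based_polarity("not good")` returns `"positive"` because the counter sees only the word "good". This is one concrete instance of the inflexibility described above: word counting ignores negation and context.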

Benefits of the Project
- 87% accuracy
- Fast processing
- For all users
- Easy to use and user friendly

Technical Details of Final Deliverable
The tool is based on the Naïve Bayes machine learning algorithm and can be run on personal computers.
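
To make the Naïve Bayes step concrete, the sketch below implements a small multinomial Naive Bayes text classifier with Laplace smoothing. It is written in Python for illustration and is not PSAT's C# implementation. The class name `NaiveBayesSentiment`, the smoothing parameter `alpha`, and the six training sentences are assumptions made for this example; they are not the project's data.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over whitespace-separated, lowercased tokens."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing constant
        self.class_counts = Counter()            # documents seen per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for token in text.lower().split():
                self.word_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior: share of training documents with this label.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            for token in text.lower().split():
                if token not in self.vocab:
                    continue  # tokens never seen in training are skipped
                # Smoothed log likelihood of the token given the label.
                count = self.word_counts[label][token]
                score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy training data, for illustration only.
TRAIN_TEXTS = [
    "sad day for cricket",
    "very disappointed and unfair decision",
    "great support for pakistan",
    "love the team great spirit",
    "tour cancelled today",
    "match schedule announced today",
]
TRAIN_LABELS = ["negative", "negative", "positive", "positive", "neutral", "neutral"]

model = NaiveBayesSentiment().fit(TRAIN_TEXTS, TRAIN_LABELS)
```

Scores are summed in log space to avoid numerical underflow on longer texts, and the smoothing term keeps a word that never appeared under a label from forcing that label's probability to zero.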

Final Deliverable of the Project: Software System
Core Industry: Education
Other Industries: IT
Core Technology: Others
Other Technologies:
Sustainable Development Goals: Quality Education

Required Resources

| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Printing | Miscellaneous | 110 | 10 | 1100 |
| Report Binding | Miscellaneous | 2 | 200 | 400 |
| SSD Hard Drive | Equipment | 1 | 6500 | 6500 |
| Total (in Rs) | | | | 8000 |