Semantic-Based Summary Evaluator

Project Title

Project Area of Specialization

Artificial Intelligence

Project Summary

Nowadays nearly everyone has access to mobile phones and the Internet, which are among the most commonly used communication tools of the modern era. People prefer text messages over voice or video conversations to express their emotions and to convey information. Textual data is therefore the most common data available on the Internet, in the form of news, blogs, emails, research papers, electronic books, articles, and social media comments or tweets (Al-Numai et al., 2020). About 2.5 quintillion bytes of data, in the form of text, digital photos, audio, and video, are created each day across various platforms on the Internet. The Internet has made this huge amount of data accessible to the 40% of human beings on the planet who are connected to it (Ahmed, 2019).

It is difficult to read and absorb such a large amount of text. In some areas, reading and understanding long articles takes considerable time and effort. To overcome the problem of extracting knowledge from large texts, automatic text summarization was introduced by H. P. Luhn in 1958. Automatic summarization can be considered a viable solution for use in various fields and applications. As the number of electronic text documents increases, so does the demand for automatic text summarization.

Summarization is a technique that creates a simple yet informative reduced form of the original text (Ahmed, 2019). The reduced form can vary according to the input data sources, and a reduction rate of 5-40% of the original text is considered acceptable (Fattah, 2014; Uddin, 2007; Mosa et al., 2018; Anand, Wagh, 2019; Goularte et al., 2019; Mohd et al., 2020; Mutlu et al., 2019). Thus, the purpose of automatic text summarization is to present the input source text as a short, simple version that retains the meaningful information.
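As a small illustration of the reduction rate mentioned above, the following sketch measures a summary's length as a fraction of the source length and checks it against the 5-40% band. It assumes that the rate is counted in whitespace-separated words and that it refers to the size of the summary relative to the original; the function names are hypothetical and not part of the project.

```python
def compression_ratio(source: str, summary: str) -> float:
    """Return the summary length as a fraction of the source length, in words."""
    source_words = source.split()
    if not source_words:
        raise ValueError("source text is empty")
    return len(summary.split()) / len(source_words)


def within_accepted_range(ratio: float, low: float = 0.05, high: float = 0.40) -> bool:
    """Check a ratio against the 5-40% band cited in the text (assumed inclusive)."""
    return low <= ratio <= high


# Placeholder texts: a 200-word source and a 30-word summary.
source = " ".join(["word"] * 200)
summary = " ".join(["word"] * 30)

ratio = compression_ratio(source, summary)
print(f"{ratio:.2%}", within_accepted_range(ratio))
```

A character-based or sentence-based count would work the same way; the word count is only the simplest choice here.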

Based on output type, text summarization can be categorized into two methods: extractive summarization and abstractive summarization (Tas, Kiyani, 2017; Pontes et al., 2019; Raphal et al., 2018; Bhatti et al., 2019; Alami et al., 2019; Alfarra et al., 2019). This research project focuses on abstractive summarization, which is much closer to the way humans summarize. When humans summarize, we do not just extract sentences. Our brain works differently: we create an internal representation of the text we have just read and then generate the summary from it in our own words. The words used in the summary may not appear in the original text. The main focus will be on the evaluation of the summaries. Evaluation is the most important phase of the text summarization process, since it compares the results of summarization algorithms. The summary generated by the system is compared against a baseline summary (mostly artificial). Recall-Oriented Understudy for Gisting Evaluation (ROUGE) (Lin, 2004) and Bilingual Evaluation Understudy (BLEU) (Papineni et al., 2002) are two frequently used evaluation metrics.
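To make the ROUGE idea concrete, here is a minimal sketch of ROUGE-N recall: the share of reference n-grams that also appear in the system summary, with clipped counts. It is a simplified illustration and not the official ROUGE toolkit; it assumes lowercased whitespace tokenization, and the example sentences are invented.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    """Fraction of reference n-grams matched by the candidate (clipped counts)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


reference = "the cat sat on the mat"
exact = "the cat sat on the mat"
paraphrase = "a feline rested on the rug"

print(rouge_n_recall(exact, reference, n=1))       # identical wording
print(rouge_n_recall(paraphrase, reference, n=1))  # same meaning, different words
print(rouge_n_recall(paraphrase, reference, n=2))
```

The paraphrase shares only "on" and "the" with the reference, so it scores far lower than the exact copy even though a human reader would judge the two sentences to be close in meaning.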

Project Objectives

The research project focuses on creating abstractive summaries using the recent SMITH algorithm.
The current ROUGE metric is biased towards lexical similarity between system-generated summaries and human-made summaries (ShafieiBavani, 2018). The thesis thus seeks to enhance the evaluation model so that ROUGE can identify the semantic similarity of texts.

Project Implementation Method

We propose a semantic-based approach (ROUGE-M) to overcome the limitation of high lexical dependency in ROUGE. We disambiguate each word into its intended sense and obtain the probability distribution of each sense over all senses in the network. Weights in this distribution denote the relevance of the corresponding senses.
At each iteration, we measure semantic similarity by looking at the path taken by the random walker and weighting the overlaps between a pair of ranked vectors. Our approach computes semantic similarity scores between n-grams, along with their match counts, to perform both semantic and lexical comparisons of peer and model summaries.
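The random walk and the ranked-vector overlap described above can be sketched as follows. This is only an illustration under stated assumptions: the sense network is a tiny hand-made graph (a real system would use a lexical resource such as WordNet), the random walk is modelled as personalized PageRank, and the overlap is a rank-based weighted overlap in which shared dimensions that rank highly in both vectors contribute more. The proposal does not specify these exact formulas, so all names, edges, and parameters here are hypothetical.

```python
# Hypothetical toy sense network (undirected adjacency lists); not real WordNet data.
GRAPH = {
    "car": ["vehicle", "road", "truck"],
    "truck": ["vehicle", "road", "car"],
    "vehicle": ["car", "truck", "entity"],
    "road": ["car", "truck"],
    "entity": ["vehicle", "fruit"],
    "fruit": ["apple", "banana", "entity"],
    "apple": ["fruit", "banana"],
    "banana": ["fruit", "apple"],
}


def personalized_pagerank(graph, seed, damping=0.85, iterations=100):
    """Distribution over senses for a random walker that restarts at `seed`."""
    scores = {node: (1.0 if node == seed else 0.0) for node in graph}
    for _ in range(iterations):
        # Restart mass goes to the seed; the rest follows the edges.
        nxt = {node: ((1.0 - damping) if node == seed else 0.0) for node in graph}
        for node, neighbours in graph.items():
            share = damping * scores[node] / len(neighbours)
            for neighbour in neighbours:
                nxt[neighbour] += share
        scores = nxt
    return scores


def ranks(vector, keys):
    """Rank the given dimensions by descending weight (1 = most relevant)."""
    ordered = sorted(keys, key=lambda k: (-vector[k], k))
    return {k: i for i, k in enumerate(ordered, start=1)}


def weighted_overlap(vec_a, vec_b):
    """Rank-based overlap in [0, 1]; ranks are taken over the shared dimensions."""
    shared = [k for k in vec_a if vec_a[k] > 0 and vec_b.get(k, 0) > 0]
    if not shared:
        return 0.0
    rank_a, rank_b = ranks(vec_a, shared), ranks(vec_b, shared)
    numerator = sum(1.0 / (rank_a[k] + rank_b[k]) for k in shared)
    # Best case: both vectors rank the shared dimensions identically.
    denominator = sum(1.0 / (2 * i) for i in range(1, len(shared) + 1))
    return numerator / denominator


car = personalized_pagerank(GRAPH, "car")
truck = personalized_pagerank(GRAPH, "truck")
apple = personalized_pagerank(GRAPH, "apple")

print("car vs truck:", round(weighted_overlap(car, truck), 3))
print("car vs apple:", round(weighted_overlap(car, apple), 3))
```

In this toy network "car" and "truck" receive a higher similarity than "car" and "apple", although none of the three words match lexically. A semantic score of this kind could then be combined with the n-gram match counts, which is the role the text assigns to ROUGE-M.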

Benefits of the Project

The created summaries will be closer to human-generated summaries, and the semantic-based evaluator will assess the summaries on their emotional and semantic basis.

Technical Details of Final Deliverable

Encoder-decoder-based abstractive model using the SMITH algorithm, and a semantic-based ROUGE-M evaluation model.

Final Deliverable of the Project

Software System

Core Industry

Education

Other Industries

Core Technology

Artificial Intelligence (AI)

Other Technologies

Sustainable Development Goals

Required Resources

Item Name	Type	No. of Units	Per Unit Cost (in Rs)	Total (in Rs)
GPU	Equipment	1	50000	50000
			Total (in Rs)	50000

