Beyond ROUGE: A Comprehensive Evaluation Metric for Abstractive Summarization Leveraging Similarity, Entailment, and Acceptability

Date

2024

Publisher

World Scientific Publ Co Pte Ltd

Green Open Access

No

Publicly Funded

No
Impulse
Average
Influence
Average
Popularity
Average

Abstract

The vast amount of textual information on the internet has amplified the importance of text summarization models. Abstractive summarization generates original words and sentences that may not exist in the source document. Such models may suffer from shortcomings such as poor linguistic acceptability and hallucinations. Recall-Oriented Understudy for Gisting Evaluation (ROUGE) is a metric commonly used to evaluate abstractive summarization models. However, because it is based on n-gram overlap, it ignores several critical linguistic aspects. In this work, we propose the Similarity, Entailment, and Acceptability Score (SEAScore), an automatic metric for evaluating abstractive text summarization models that uses state-of-the-art pre-trained language models. SEAScore comprises three language models (LMs) that extract meaningful linguistic features from candidate and reference summaries and a weighted sum aggregator that computes an evaluation score. Experimental results show that our LM-based SEAScore metric correlates better with human judgment than standard evaluation metrics such as ROUGE-N and BERTScore.
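
The two technical points in the abstract, the n-gram basis of ROUGE and the weighted sum aggregation of three feature scores, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the example feature values, and the weights are hypothetical, and the abstract does not state which language models or weights SEAScore uses. The sketch assumes each of the three feature scores lies in [0, 1] and that the weights sum to 1.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Clipped n-gram recall: overlapping n-grams / n-grams in the reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def weighted_sum_score(similarity, entailment, acceptability,
                       weights=(1 / 3, 1 / 3, 1 / 3)):
    """Hypothetical aggregator: weighted sum of three scores in [0, 1]."""
    features = (similarity, entailment, acceptability)
    if any(not 0.0 <= f <= 1.0 for f in features):
        raise ValueError("each feature score is assumed to lie in [0, 1]")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    return sum(w * f for w, f in zip(weights, features))


reference = "the cat sat on the mat"
scrambled = "mat the on sat cat the"

# Same unigrams in a different order: ROUGE-1 recall cannot tell them apart.
print(rouge_n_recall(scrambled, reference, n=1))  # prints 1.0

# Placeholder feature scores standing in for language-model outputs
# (assumed values, not results from the paper).
print(weighted_sum_score(0.9, 0.8, 0.1, weights=(0.4, 0.4, 0.2)))
```

The scrambled candidate is ungrammatical yet receives a perfect ROUGE-1 recall, which is the kind of linguistic aspect an n-gram metric ignores. In the weighted sum, a low acceptability score lowers the aggregate even when the similarity and entailment scores are high.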

Description

YILDIZ, Beytullah/0000-0001-7664-5145; Briman, Mohammed Khalid Hilmi/0009-0000-5785-6916

Keywords

Machine learning, deep learning, natural language processing, transformer, text summarization, language models

Fields of Science

0404 agricultural biotechnology, 0202 electrical engineering, electronic engineering, information engineering, 04 agricultural and veterinary sciences, 02 engineering and technology

WoS Q

Q4

Scopus Q

Q3
OpenCitations Citation Count
8

Source

International Journal on Artificial Intelligence Tools

Volume

33

Issue

5

PlumX Metrics
Citations

Scopus : 11

Captures

Mendeley Readers : 15

OpenAlex FWCI
4.3644