Beyond ROUGE: A Comprehensive Evaluation Metric for Abstractive Summarization Leveraging Similarity, Entailment, and Acceptability
Date
2024
Publisher
World Scientific Publ Co Pte Ltd
Green Open Access
No
Publicly Funded
No
Abstract
The vast amount of textual information on the internet has amplified the importance of text summarization models. Abstractive summarization generates original words and sentences that may not exist in the source document. Such abstractive models can suffer from shortcomings such as poor linguistic acceptability and hallucinations. Recall-Oriented Understudy for Gisting Evaluation (ROUGE) is a metric commonly used to evaluate abstractive summarization models. However, because it relies on n-gram overlap, it ignores several critical linguistic aspects. In this work, we propose the Similarity, Entailment, and Acceptability Score (SEAScore), an automatic metric for evaluating abstractive text summarization models using state-of-the-art pre-trained language models. SEAScore comprises three language models (LMs) that extract meaningful linguistic features from candidate and reference summaries, and a weighted-sum aggregator that combines these features into a single evaluation score. Experimental results show that our LM-based SEAScore metric correlates better with human judgment than standard evaluation metrics such as ROUGE-N and BERTScore.
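The weighted-sum aggregation described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the weights, the score ranges, or the specific language models, so everything here is an assumption for illustration: the three feature values are taken to be already computed and scaled to [0, 1], and the names `Features` and `weighted_sum_score` and the default equal weights are hypothetical, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Features:
    """Hypothetical container for the three per-summary features.

    Each value is assumed to be pre-computed by a language model
    and scaled to the range [0, 1].
    """
    similarity: float
    entailment: float
    acceptability: float


def weighted_sum_score(features, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Combine the three features into one score with a weighted sum.

    The weights are illustrative assumptions; they are normalized so
    that the result stays in [0, 1] when every feature is in [0, 1].
    """
    values = (features.similarity, features.entailment, features.acceptability)
    if len(weights) != len(values):
        raise ValueError("expected exactly three weights")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("features are assumed to lie in [0, 1]")
    # Normalized weighted sum over the three features.
    return sum(w * v for w, v in zip(weights, values)) / total


if __name__ == "__main__":
    example = Features(similarity=0.9, entailment=0.8, acceptability=1.0)
    print(weighted_sum_score(example, weights=(0.5, 0.3, 0.2)))
```

Normalizing the weights keeps scores comparable across different weight choices. In practice the weights would presumably be tuned against human judgments, which this sketch does not attempt.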
Description
YILDIZ, Beytullah/0000-0001-7664-5145; Briman, Mohammed Khalid Hilmi/0009-0000-5785-6916
Keywords
Machine learning, deep learning, natural language processing, transformer, text summarization, language models
Fields of Science
0404 agricultural biotechnology, 0202 electrical engineering, electronic engineering, information engineering, 04 agricultural and veterinary sciences, 02 engineering and technology
WoS Q
Q4
Scopus Q
Q3

OpenCitations Citation Count
N/A
Source
International Journal on Artificial Intelligence Tools
Volume
33
Issue
5
PlumX Metrics
Citations
Scopus: 10
Captures
Mendeley Readers: 13
SCOPUS™ Citations
10
checked on Feb 04, 2026
Web of Science™ Citations
6
checked on Feb 04, 2026
Page Views
3
checked on Feb 04, 2026
OpenAlex FWCI
7.02656404