Beyond Rouge: a Comprehensive Evaluation Metric for Abstractive Summarization Leveraging Similarity, Entailment, and Acceptability

No Thumbnail Available

Date

2024

Journal Title

Journal ISSN

Volume Title

Publisher

World Scientific Publ Co Pte Ltd

Open Access Color

Green Open Access

No

OpenAIRE Downloads

OpenAIRE Views

Publicly Funded

No
Impulse
Average
Influence
Average
Popularity
Average

Research Projects

Journal Issue

Abstract

A vast amount of textual information on the internet has amplified the importance of text summarization models. Abstractive summarization generates original words and sentences that may not exist in the source document to be summarized. Such abstractive models may suffer from shortcomings such as linguistic acceptability and hallucinations. Recall-Oriented Understudy for Gisting Evaluation (ROUGE) is a metric commonly used to evaluate abstractive summarization models. However, due to its n-gram-based approach, it ignores several critical linguistic aspects. In this work, we propose Similarity, Entailment, and Acceptability Score (SEAScore), an automatic evaluation metric for evaluating abstractive text summarization models using the power of state-of-the-art pre-trained language models. SEAScore comprises three language models (LMs) that extract meaningful linguistic features from candidate and reference summaries and a weighted sum aggregator that computes an evaluation score. Experimental results show that our LM-based SEAScore metric correlates better with human judgment than standard evaluation metrics such as ROUGE-N and BERTScore.

Description

YILDIZ, Beytullah/0000-0001-7664-5145; Briman, Mohammed Khalid Hilmi/0009-0000-5785-6916

Keywords

Machine learning, deep learning, natural language processing, transformer, text summarization, language models

Turkish CoHE Thesis Center URL

Fields of Science

0404 agricultural biotechnology, 0202 electrical engineering, electronic engineering, information engineering, 04 agricultural and veterinary sciences, 02 engineering and technology

Citation

WoS Q

Q4

Scopus Q

Q3
OpenCitations Logo
OpenCitations Citation Count
N/A

Source

International Journal on Artificial Intelligence Tools

Volume

33

Issue

5

Start Page

End Page

Collections

PlumX Metrics
Citations

Scopus : 10

Captures

Mendeley Readers : 13

SCOPUS™ Citations

10

checked on Feb 04, 2026

Web of Science™ Citations

6

checked on Feb 04, 2026

Page Views

3

checked on Feb 04, 2026

Google Scholar Logo
Google Scholar™
OpenAlex Logo
OpenAlex FWCI
7.02656404

Sustainable Development Goals

1

NO POVERTY
NO POVERTY Logo

3

GOOD HEALTH AND WELL-BEING
GOOD HEALTH AND WELL-BEING Logo

4

QUALITY EDUCATION
QUALITY EDUCATION Logo

5

GENDER EQUALITY
GENDER EQUALITY Logo

7

AFFORDABLE AND CLEAN ENERGY
AFFORDABLE AND CLEAN ENERGY Logo

8

DECENT WORK AND ECONOMIC GROWTH
DECENT WORK AND ECONOMIC GROWTH Logo

9

INDUSTRY, INNOVATION AND INFRASTRUCTURE
INDUSTRY, INNOVATION AND INFRASTRUCTURE Logo

10

REDUCED INEQUALITIES
REDUCED INEQUALITIES Logo

12

RESPONSIBLE CONSUMPTION AND PRODUCTION
RESPONSIBLE CONSUMPTION AND PRODUCTION Logo

16

PEACE, JUSTICE AND STRONG INSTITUTIONS
PEACE, JUSTICE AND STRONG INSTITUTIONS Logo

17

PARTNERSHIPS FOR THE GOALS
PARTNERSHIPS FOR THE GOALS Logo