Related papers: SummEval: Re-evaluating Summarization Evaluation

Neural Text Summarization: A Critical Evaluation

Text summarization aims at compressing long documents into a shorter form that conveys the most important parts of the original document. Despite increased interest in the community and notable research effort, progress on benchmark…

Computation and Language · Computer Science 2019-08-27 Wojciech Kryściński, Nitish Shirish Keskar, Bryan McCann, Caiming Xiong, Richard Socher

Evaluate Summarization in Fine-Granularity: Auto Evaluation with LLM

Due to the exponential growth of information and the need for efficient information consumption, the task of summarization has gained paramount importance. Evaluating summarization accurately and objectively presents significant challenges,…

Computation and Language · Computer Science 2024-12-31 Dong Yuan, Eti Rastogi, Fen Zhao, Sagar Goyal, Gautam Naik, Sree Prasanna Rajagopal

OpinSummEval: Revisiting Automated Evaluation for Opinion Summarization

Opinion summarization sets itself apart from other types of summarization tasks due to its distinctive focus on aspects and sentiments. Although certain automated evaluation methods like ROUGE have gained popularity, we have found them to…

Computation and Language · Computer Science 2023-11-14 Yuchen Shen, Xiaojun Wan

RefSum: Refactoring Neural Summarization

Although some recent works show potential complementarity among different state-of-the-art systems, few works try to investigate this problem in text summarization. Researchers in other areas commonly refer to the techniques of reranking or…

Computation and Language · Computer Science 2021-04-16 Yixin Liu, Zi-Yi Dou, Pengfei Liu

A Comparative Study of Quality Evaluation Methods for Text Summarization

Evaluating text summarization has been a challenging task in natural language processing (NLP). Automatic metrics which heavily rely on reference summaries are not suitable in many situations, while human evaluation is time-consuming and…

Computation and Language · Computer Science 2024-07-02 Huyen Nguyen, Haihua Chen, Lavanya Pobbathi, Junhua Ding

Re-evaluating Evaluation in Text Summarization

Automated evaluation metrics as a stand-in for manual evaluation are an essential part of the development of text-generation tasks such as text summarization. However, while the field has progressed, our standard metrics have not -- for…

Computation and Language · Computer Science 2020-10-15 Manik Bhandari, Pranav Gour, Atabak Ashfaq, Pengfei Liu, Graham Neubig

An Empirical Comparison of Text Summarization: A Multi-Dimensional Evaluation of Large Language Models

Text summarization is crucial for mitigating information overload across domains like journalism, medicine, and business. This research evaluates summarization performance across 17 large language models (OpenAI, Google, Anthropic,…

Computation and Language · Computer Science 2025-04-08 Anantharaman Janakiraman, Behnaz Ghoraani

A Critical Look at Meta-evaluating Summarisation Evaluation Metrics

Effective summarisation evaluation metrics enable researchers and practitioners to compare different summarisation systems efficiently. Estimating the effectiveness of an automatic evaluation metric, termed meta-evaluation, is a critically…

Computation and Language · Computer Science 2024-10-01 Xiang Dai, Sarvnaz Karimi, Biaoyan Fang

LLM-ReSum: A Framework for LLM Reflective Summarization through Self-Evaluation

Reliable evaluation of large language model (LLM)-generated summaries remains an open challenge, particularly across heterogeneous domains and document lengths. We conduct a comprehensive meta-evaluation of 14 automatic summarization…

Computation and Language · Computer Science 2026-04-29 Huyen Nguyen, Haoxuan Zhang, Yang Zhang, Junhua Ding, Haihua Chen

Benchmarking Large Language Models for News Summarization

Large language models (LLMs) have shown promise for automatic summarization but the reasons behind their successes are poorly understood. By conducting a human evaluation on ten LLMs across different pretraining methods, prompts, and model…

Computation and Language · Computer Science 2023-02-01 Tianyi Zhang, Faisal Ladhak, Esin Durmus, Percy Liang, Kathleen McKeown, Tatsunori B. Hashimoto

A Survey on Neural Network-Based Summarization Methods

Automatic text summarization, the automated process of shortening a text while preserving the main ideas of the document(s), is a critical research area in natural language processing. The aim of this literature review is to survey the…

Computation and Language · Computer Science 2018-04-13 Yue Dong

Align then Summarize: Automatic Alignment Methods for Summarization Corpus Creation

Summarizing texts is not a straightforward task. Before even considering text summarization, one should determine what kind of summary is expected. How much should the information be compressed? Is it relevant to reformulate or should the…

Computation and Language · Computer Science 2020-07-16 Paul Tardy, David Janiszek, Yannick Estève, Vincent Nguyen

Revisiting Automatic Question Summarization Evaluation in the Biomedical Domain

Automatic evaluation metrics have been facilitating the rapid development of automatic summarization methods by providing instant and fair assessments of the quality of summaries. Most metrics have been developed for the general domain,…

Computation and Language · Computer Science 2023-03-21 Hongyi Yuan, Yaoyun Zhang, Fei Huang, Songfang Huang

EmailSum: Abstractive Email Thread Summarization

Recent years have brought about an interest in the challenging task of summarizing conversation threads (meetings, online discussions, etc.). Such summaries help analysis of the long text to quickly catch up with the decisions made and thus…

Computation and Language · Computer Science 2021-08-02 Shiyue Zhang, Asli Celikyilmaz, Jianfeng Gao, Mohit Bansal

Dimensionality on Summarization

Summarization is one of the key features of human intelligence. It plays an important role in understanding and representation. With rapid and continual expansion of texts, pictures and videos in cyberspace, automatic summarization becomes…

Computation and Language · Computer Science 2015-07-02 Hai Zhuge

SummScore: A Comprehensive Evaluation Metric for Summary Quality Based on Cross-Encoder

Text summarization models are often trained to produce summaries that meet human quality requirements. However, the existing evaluation metrics for summary text are only rough proxies for summary quality, suffering from low correlation with…

Computation and Language · Computer Science 2022-07-12 Wuhang Lin, Shasha Li, Chen Zhang, Bin Ji, Jie Yu, Jun Ma, Zibo Yi

AllSummedUp: un framework open-source pour comparer les metriques d'evaluation de resume

This paper investigates reproducibility challenges in automatic text summarization evaluation. Based on experiments conducted across six representative metrics ranging from classical approaches like ROUGE to recent LLM-based methods…

Computation and Language · Computer Science 2025-09-01 Tanguy Herserant, Vincent Guigue

Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation

Human evaluation is the foundation upon which the evaluation of both summarization systems and automatic metrics rests. However, existing human evaluation studies for summarization either exhibit a low inter-annotator agreement or have…

Computation and Language · Computer Science 2023-06-07 Yixin Liu, Alexander R. Fabbri, Pengfei Liu, Yilun Zhao, Linyong Nan, Ruilin Han, Simeng Han, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev

Automatic Text Summarization Methods: A Comprehensive Review

One of the most pressing issues that have arisen due to the rapid growth of the Internet is known as information overloading. Simplifying the relevant information in the form of a summary will assist many people because the material on any…

Computation and Language · Computer Science 2022-04-06 Divakar Yadav, Jalpa Desai, Arun Kumar Yadav

UniSumEval: Towards Unified, Fine-Grained, Multi-Dimensional Summarization Evaluation for LLMs

Existing benchmarks for summarization quality evaluation often lack diverse input scenarios, focus on narrowly defined dimensions (e.g., faithfulness), and struggle with subjective and coarse-grained annotation schemes. To address these…

Computation and Language · Computer Science 2024-10-02 Yuho Lee, Taewon Yun, Jason Cai, Hang Su, Hwanjun Song