Related papers: Reference and Document Aware Semantic Evaluation M…

Better Summarization Evaluation with Word Embeddings for ROUGE

ROUGE is a widely adopted, automatic evaluation measure for text summarization. While it has been shown to correlate well with human judgements, it is biased towards surface lexical similarities. This makes it unsuitable for the evaluation…

Computation and Language · Computer Science 2015-08-26 Jun-Ping Ng, Viktoria Abrecht

SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization via Negative Sampling

Canonical automatic summary evaluation metrics, such as ROUGE, focus on lexical similarity, which captures neither semantics nor linguistic quality well, and require a reference summary, which is costly to obtain. Recently, there have been a…

Computation and Language · Computer Science 2022-05-06 Forrest Sheng Bao, Hebi Li, Ge Luo, Minghui Qiu, Yinfei Yang, Youbiao He, Cen Chen

Understanding the Extent to which Summarization Evaluation Metrics Measure the Information Quality of Summaries

Reference-based metrics such as ROUGE or BERTScore evaluate the content quality of a summary by comparing the summary to a reference. Ideally, this comparison should measure the summary's information quality by calculating how much…

Computation and Language · Computer Science 2020-10-26 Daniel Deutsch, Dan Roth

Revisiting Summarization Evaluation for Scientific Articles

Evaluation of text summarization approaches has been mostly based on metrics that measure the similarity of system-generated summaries to a set of human-written gold-standard summaries. The most widely used metric in summarization…

Computation and Language · Computer Science 2016-04-05 Arman Cohan, Nazli Goharian

ROUGE 2.0: Updated and Improved Measures for Evaluation of Summarization Tasks

Evaluation of summarization tasks is extremely crucial to determining the quality of machine generated summaries. Over the last decade, ROUGE has become the standard automatic evaluation measure for evaluating summarization tasks. While…

Information Retrieval · Computer Science 2018-03-07 Kavita Ganesan

A Semantically Motivated Approach to Compute ROUGE Scores

ROUGE is one of the first and most widely used evaluation metrics for text summarization. However, its assessment merely relies on surface similarities between peer and model summaries. Consequently, ROUGE is unable to fairly evaluate…

Computation and Language · Computer Science 2017-10-23 Elaheh ShafieiBavani, Mohammad Ebrahimi, Raymond Wong, Fang Chen

Source code summarization involves creating brief descriptions of source code in natural language. These descriptions are a key component of software documentation such as JavaDocs. Automatic code summarization is a prized target of…

Software Engineering · Computer Science 2022-04-05 Sakib Haque, Zachary Eberhart, Aakash Bansal, Collin McMillan

Re-Examining System-Level Correlations of Automatic Summarization Evaluation Metrics

How reliably an automatic summarization evaluation metric replicates human judgments of summary quality is quantified by system-level correlations. We identify two ways in which the definition of the system-level correlation is inconsistent…

Computation and Language · Computer Science 2022-04-22 Daniel Deutsch, Rotem Dror, Dan Roth

QuestEval: Summarization Asks for Fact-based Evaluation

Summarization evaluation remains an open research problem: current metrics such as ROUGE are known to be limited and to correlate poorly with human judgments. To alleviate this issue, recent work has proposed evaluation metrics which rely…

Computation and Language · Computer Science 2021-04-12 Thomas Scialom, Paul-Alexis Dray, Patrick Gallinari, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano, Alex Wang

Re-evaluating Evaluation in Text Summarization

Automated evaluation metrics as a stand-in for manual evaluation are an essential part of the development of text-generation tasks such as text summarization. However, while the field has progressed, our standard metrics have not -- for…

Computation and Language · Computer Science 2020-10-15 Manik Bhandari, Pranav Gour, Atabak Ashfaq, Pengfei Liu, Graham Neubig

Automated Metrics for Medical Multi-Document Summarization Disagree with Human Evaluations

Evaluating multi-document summarization (MDS) quality is difficult. This is especially true in the case of MDS for biomedical literature reviews, where models must synthesize contradicting evidence reported across different documents. Prior…

Computation and Language · Computer Science 2023-05-24 Lucy Lu Wang, Yulia Otmakhova, Jay DeYoung, Thinh Hung Truong, Bailey E. Kuehl, Erin Bransom, Byron C. Wallace

Answers Unite! Unsupervised Metrics for Reinforced Summarization Models

Abstractive summarization approaches based on Reinforcement Learning (RL) have recently been proposed to overcome classical likelihood maximization. RL makes it possible to consider complex, possibly non-differentiable, metrics that globally assess…

Computation and Language · Computer Science 2019-09-05 Thomas Scialom, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano

RISE: Leveraging Retrieval Techniques for Summarization Evaluation

Evaluating automatically-generated text summaries is a challenging task. While there have been many interesting approaches, they still fall short of human evaluations. We present RISE, a new approach for evaluating summaries by leveraging…

Computation and Language · Computer Science 2023-05-23 David Uthus, Jianmo Ni

Evaluating Code Summarization Techniques: A New Metric and an Empirical Characterization

Several code summarization techniques have been proposed in the literature to automatically document a code snippet or a function. Ideally, software developers should be involved in assessing the quality of the generated summaries. However,…

Software Engineering · Computer Science 2023-12-27 Antonio Mastropaolo, Matteo Ciniselli, Massimiliano Di Penta, Gabriele Bavota

WIDAR -- Weighted Input Document Augmented ROUGE

The task of automatic text summarization has gained a lot of traction due to recent advancements in machine learning techniques. However, evaluating the quality of a generated summary remains an open problem. The literature has…

Computation and Language · Computer Science 2022-01-25 Raghav Jain, Vaibhav Mavi, Anubhav Jangra, Sriparna Saha

An analysis of document graph construction methods for AMR summarization

Abstract Meaning Representation (AMR) is a graph-based semantic representation for sentences, composed of collections of concepts linked by semantic relations. AMR-based approaches have found success in a variety of applications, but a challenge to…

Computation and Language · Computer Science 2021-11-30 Fei-Tzin Lee, Chris Kedzie, Nakul Verma, Kathleen McKeown

Quality of syntactic implication of RL-based sentence summarization

Work on summarization has explored both reinforcement learning (RL) optimization using ROUGE as a reward and syntax-aware models, such as models whose input is enriched with part-of-speech (POS) tags and dependency information. However, it…

Computation and Language · Computer Science 2019-12-12 Hoa T. Le, Christophe Cerisara, Claire Gardent

A Training-free and Reference-free Summarization Evaluation Metric via Centrality-weighted Relevance and Self-referenced Redundancy

In recent years, reference-based and supervised summarization evaluation metrics have been widely explored. However, collecting human-annotated references and ratings is costly and time-consuming. To avoid these limitations, we propose a…

Computation and Language · Computer Science 2021-06-29 Wang Chen, Piji Li, Irwin King

A Method of Passage-Based Document Retrieval in Question Answering System

We propose a method for using the scoring values of passages to effectively retrieve documents in a Question Answering system. For this, we suggest an evaluation function that considers the proximity between question terms in a passage. And…

Information Retrieval · Computer Science 2015-12-18 Man-Hung Jong, Chong-Han Ri, Hyok-Chol Choe, Chol-Jun Hwang

Thesis: Document Summarization with applications to Keyword extraction and Image Retrieval

Automatic summarization is the process of reducing a text document in order to generate a summary that retains the most important points of the original document. In this work, we study two problems - i) summarizing a text document as set…

Information Retrieval · Computer Science 2024-06-04 Jayaprakash Sundararaj