Related papers: Structure-Aware Sampling: Flexible and Accurate Su…

On the variance of subset sum estimation

For high volume data streams and large data warehouses, sampling is used for efficient approximate answers to aggregate queries over selected subsets. Mathematically, we are dealing with a set of weighted items and want to support queries…

Data Structures and Algorithms · Computer Science 2007-05-23 Mario Szegedy, Mikkel Thorup

Scalable Approximation Algorithm for Graph Summarization

Massive sizes of real-world graphs, such as social networks and web graph, impose serious challenges to process and perform analytics on them. These issues can be resolved by working on a small summary of the graph instead. A summary is a…

Data Structures and Algorithms · Computer Science 2018-06-12 Maham Anwar Beg, Muhammad Ahmad, Arif Zaman, Imdadullah Khan

Adaptive Summaries: A Personalized Concept-based Summarization Approach by Learning from Users' Feedback

Efficiently exploring a tremendous amount of data to make a decision, similar to answering a complicated question, is challenging in many real-world application scenarios. In this context, automatic summarization has substantial…

Artificial Intelligence · Computer Science 2021-12-21 Samira Ghodratnama, Mehrdad Zakershahrak, Fariborz Sobhanmanesh

StructSum: Summarization via Structured Representations

Abstractive text summarization aims at compressing the information of a long source document into a rephrased, condensed summary. Despite advances in modeling techniques, abstractive summarization models still suffer from several key…

Computation and Language · Computer Science 2021-02-17 Vidhisha Balachandran, Artidoro Pagnoni, Jay Yoon Lee, Dheeraj Rajagopal, Jaime Carbonell, Yulia Tsvetkov

Structure-Aware Decoding Mechanisms for Complex Entity Extraction with Large-Scale Language Models

This paper proposes a structure-aware decoding method based on large language models to address the difficulty of traditional approaches in maintaining both semantic integrity and structural consistency in nested and overlapping entity…

Computation and Language · Computer Science 2026-01-29 Zhimin Qiu, Di Wu, Feng Liu, Yuxiao Wang

Query-adaptive Video Summarization via Quality-aware Relevance Estimation

Although the problem of automatic video summarization has recently received a lot of attention, the problem of creating a video summary that also highlights elements relevant to a search query has been less studied. We address this problem…

Computer Vision and Pattern Recognition · Computer Science 2017-09-29 Arun Balajee Vasudevan, Michael Gygli, Anna Volokitin, Luc Van Gool

Adaptive Threshold Sampling

Sampling is a fundamental problem in computer science and statistics. However, for a given task and stream, it is often not possible to choose good sampling probabilities in advance. We derive a general framework for adaptively changing the…

Machine Learning · Statistics 2022-06-16 Daniel Ting

How well do you know your summarization datasets?

State-of-the-art summarization systems are trained and evaluated on massive datasets scraped from the web. Despite their prevalence, we know very little about the underlying characteristics (data noise, summarization complexity, etc.) of…

Computation and Language · Computer Science 2021-06-23 Priyam Tejaswin, Dhruv Naik, Pengfei Liu

Scalable Rule Lists Learning with Sampling

Learning interpretable models has become a major focus of machine learning research, given the increasing prominence of machine learning in socially important decision-making. Among interpretable models, rule lists are among the best-known…

Machine Learning · Computer Science 2024-06-19 Leonardo Pellegrina, Fabio Vandin

Beyond One-Size-Fits-All Summarization: Customizing Summaries for Diverse Users

In recent years, automatic text summarization has witnessed significant advancement, particularly with the development of transformer-based models. However, the challenge of controlling the readability level of generated summaries remains…

Computation and Language · Computer Science 2025-03-17 Mehmet Samet Duran, Tevfik Aytekin

Spatial Random Sampling: A Structure-Preserving Data Sketching Tool

Random column sampling is not guaranteed to yield data sketches that preserve the underlying structures of the data and may not sample sufficiently from less-populated data clusters. Also, adaptive sampling can often provide accurate low…

Machine Learning · Computer Science 2017-10-11 Mostafa Rahmani, George Atia

Topic-Controllable Summarization: Topic-Aware Evaluation and Transformer Methods

Topic-controllable summarization is an emerging research area with a wide range of potential applications. However, existing approaches suffer from significant limitations. For example, the majority of existing methods built upon recurrent…

Computation and Language · Computer Science 2024-04-18 Tatiana Passali, Grigorios Tsoumakas

Independent Range Sampling, Revisited Again

We revisit the range sampling problem: the input is a set of points where each point is associated with a real-valued weight. The goal is to store them in a structure such that given a query range and an integer $k$, we can extract $k$…

Data Structures and Algorithms · Computer Science 2019-03-20 Peyman Afshani, Jeff M. Phillips

Can Constructions "SCAN" Compositionality?

Sequence to Sequence models struggle at compositionality and systematic generalisation even while they excel at many other tasks. We attribute this limitation to their failure to internalise constructions conventionalised form meaning…

Computation and Language · Computer Science 2025-09-25 Ganesh Katrapati, Manish Shrivastava

Abstractive Summarization Using Attentive Neural Techniques

In a world of proliferating data, the ability to rapidly summarize text is growing in importance. Automatic summarization of text can be thought of as a sequence to sequence problem. Another area of natural language processing that solves a…

Computation and Language · Computer Science 2018-10-23 Jacob Krantz, Jugal Kalita

Storyboard: Optimizing Precomputed Summaries for Aggregation

An emerging class of data systems partition their data and precompute approximate summaries (i.e., sketches and samples) for each segment to reduce query costs. They can then aggregate and combine the segment summaries to estimate results…

Databases · Computer Science 2020-02-11 Edward Gan, Peter Bailis, Moses Charikar

What Makes a Good and Useful Summary? Incorporating Users in Automatic Summarization Research

Automatic text summarization has enjoyed great progress over the years and is used in numerous applications, impacting the lives of many. Despite this development, there is little research that meaningfully investigates how the current…

Computation and Language · Computer Science 2022-05-02 Maartje ter Hoeve, Julia Kiseleva, Maarten de Rijke

Non-Adaptive Adaptive Sampling on Turnstile Streams

Adaptive sampling is a useful algorithmic tool for data summarization problems in the classical centralized setting, where the entire dataset is available to the single processor performing the computation. Adaptive sampling repeatedly…

Data Structures and Algorithms · Computer Science 2020-04-24 Sepideh Mahabadi, Ilya Razenshteyn, David P. Woodruff, Samson Zhou

Scalable Sampling for High Utility Patterns

Discovering valuable insights from data through meaningful associations is a crucial task. However, it becomes challenging when trying to identify representative patterns in quantitative databases, especially with large datasets, as…

Databases · Computer Science 2024-10-31 Lamine Diop, Marc Plantevit

What are the Desired Characteristics of Calibration Sets? Identifying Correlates on Long Form Scientific Summarization

Summarization models often generate text that is poorly calibrated to quality metrics because they are trained to maximize the likelihood of a single reference (MLE). To address this, recent work has added a calibration step, which exposes…

Computation and Language · Computer Science 2023-05-15 Griffin Adams, Bichlien H Nguyen, Jake Smith, Yingce Xia, Shufang Xie, Anna Ostropolets, Budhaditya Deb, Yuan-Jyue Chen, Tristan Naumann, Noémie Elhadad