Related papers: Aggregate Queries on Sparse Databases

Towards Approximate Query Enumeration with Sublinear Preprocessing Time

This paper aims at providing extremely efficient algorithms for approximate query enumeration on sparse databases, that come with performance and accuracy guarantees. We introduce a new model for approximate query enumeration on classes of…

Databases · Computer Science 2021-01-19 Isolde Adler, Polly Fahey

Partitioning Unstructured Sparse Tensor Algebra for Load-Balanced Parallel Execution

Sparse tensor algebra is challenging to efficiently parallelize due to the irregular, data-dependent, and potentially skewed structure of sparse computation. We propose the first partitioning algorithm that provably load balances the…

Programming Languages · Computer Science 2026-04-23 Atharva Chougule, Alexander J Root, Rubens Lacouture, Bobby Yan, Rohan Yadav, Fredrik Kjolstad

Aggregation in Probabilistic Databases via Knowledge Compilation

This paper presents a query evaluation technique for positive relational algebra queries with aggregates on a representation system for probabilistic data based on the algebraic structures of semiring and semimodule. The core of our…

Databases · Computer Science 2012-02-01 Robert Fink, Larisa Han, Dan Olteanu

A Ranking Framework for Network Resource Allocation and Scheduling via Hypergraphs

Resource allocation and scheduling are a common problem in various distributed systems. Although widely studied, the state-of-the-art solutions either do not scale or lack the expressive power to capture the most complex instances of the…

Data Structures and Algorithms · Computer Science 2025-06-03 Rajpreet Singh, Novak Boškov, Aditya Gudal, Manzoor A. Khan

An Aggregate and Iterative Disaggregate Algorithm with Proven Optimality in Machine Learning

We propose a clustering-based iterative algorithm to solve certain optimization problems in machine learning, where we start the algorithm by aggregating the original data, solving the problem on aggregated data, and then in subsequent…

Machine Learning · Statistics 2017-01-23 Young Woong Park, Diego Klabjan

Robust Aggregation for Federated Sequential Recommendation with Sparse and Poisoned Data

Federated sequential recommendation distributes model training across user devices so that behavioural data remains local, reducing privacy risks. Yet, this setting introduces two intertwined difficulties. On the one hand, individual…

Information Retrieval · Computer Science 2026-03-02 Minh Hieu Nguyen

Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection

Large language models (LLMs) augmented with retrieval exhibit robust performance and extensive versatility by incorporating external contexts. However, the input length grows linearly in the number of retrieved documents, causing a dramatic…

Computation and Language · Computer Science 2024-05-28 Yun Zhu, Jia-Chen Gu, Caitlin Sikora, Ho Ko, Yinxiao Liu, Chu-Cheng Lin, Lei Shu, Liangchen Luo, Lei Meng, Bang Liu, Jindong Chen

Aggregate Estimation Over Dynamic Hidden Web Databases

Many databases on the web are "hidden" behind (i.e., accessible only through) their restrictive, form-like, search interfaces. Recent studies have shown that it is possible to estimate aggregate query answers over such hidden web databases…

Databases · Computer Science 2014-05-02 Weimo Liu, Saravanan Thirumuruganathan, Nan Zhang, Gautam Das

A framework for computing upper bounds in passive learning settings

The task of inferring logical formulas from examples has garnered significant attention as a means to assist engineers in creating formal specifications used in the design, synthesis, and verification of computing systems. Among various…

Logic in Computer Science · Computer Science 2025-06-04 Benjamin Bordais, Daniel Neider

A Unifying Framework for Sparsity Constrained Optimization

In this paper, we consider the optimization problem of minimizing a continuously differentiable function subject to both convex constraints and sparsity constraints. By exploiting a mixed-integer reformulation from the literature, we define…

Optimization and Control · Mathematics 2021-04-28 M. Lapucci, T. Levato, F. Rinaldi, M. Sciandrone

Efficiency Optimizations for Superblock-based Sparse Retrieval

Learned sparse retrieval (LSR) is a popular method for first-stage retrieval because it combines the semantic matching of language models with efficient CPU-friendly algorithms. Previous work aggregates blocks into "superblocks" to quickly…

Information Retrieval · Computer Science 2026-02-04 Parker Carlson, Wentai Xie, Rohil Shah, Tao Yang

Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives

We consider a discrete optimization formulation for learning sparse classifiers, where the outcome depends upon a linear combination of a small subset of features. Recent work has shown that mixed integer programming (MIP) can be used to…

Machine Learning · Statistics 2021-06-08 Antoine Dedieu, Hussein Hazimeh, Rahul Mazumder

Optimal Algorithms for Ranked Enumeration of Answers to Full Conjunctive Queries

We study ranked enumeration of join-query results according to very general orders defined by selective dioids. Our main contribution is a framework for ranked enumeration over a class of dynamic programming problems that generalizes…

Databases · Computer Science 2020-09-15 Nikolaos Tziavelis, Deepak Ajwani, Wolfgang Gatterbauer, Mirek Riedewald, Xiaofeng Yang

Compressed Indexing for Consecutive Occurrences

The fundamental question considered in algorithms on strings is that of indexing, that is, preprocessing a given string for specific queries. By now we have a number of efficient solutions for this problem when the queries ask for an exact…

Data Structures and Algorithms · Computer Science 2023-04-04 Paweł Gawrychowski, Garance Gourdel, Tatiana Starikovskaya, Teresa Anna Steiner

DBCSR: A Blocked Sparse Tensor Algebra Library

Advanced algorithms for large-scale electronic structure calculations are mostly based on processing multi-dimensional sparse data. Examples are sparse matrix-matrix multiplications in linear-scaling Kohn-Sham calculations or the efficient…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-31 Ilia Sivkov, Patrick Seewald, Alfio Lazzaro, Juerg Hutter

Efficient sorting, duplicate removal, grouping, and aggregation

Database query processing requires algorithms for duplicate removal, grouping, and aggregation. Three algorithms exist: in-stream aggregation is most efficient by far but requires sorted input; sort-based aggregation relies on external…

Databases · Computer Science 2022-09-27 Thanh Do, Goetz Graefe, Jeffrey Naughton

Sparse approximation problem: how rapid simulated annealing succeeds and fails

Information processing techniques based on sparseness have been actively studied in several disciplines. Among them, a mathematical framework to approximately express a given dataset by a combination of a small number of basis vectors of an…

Information Theory · Computer Science 2016-05-04 Tomoyuki Obuchi, Yoshiyuki Kabashima

Aggregation and Ordering in Factorised Databases

A common approach to data analysis involves understanding and manipulating succinct representations of data. In earlier work, we put forward a succinct representation system for relational data called factorised databases and reported on…

Databases · Computer Science 2013-07-02 Nurzhan Bakibayev, Tomáš Kočiský, Dan Olteanu, Jakub Závodný

Minimax and Communication-Efficient Distributed Best Subset Selection with Oracle Property

The explosion of large-scale data in fields such as finance, e-commerce, and social media has outstripped the processing capabilities of single-machine systems, driving the need for distributed statistical inference methods. Traditional…

Machine Learning · Statistics 2024-09-02 Jingguo Lan, Hongmei Lin, Xueqin Wang

The Sparse Principal Component Analysis Problem: Optimality Conditions and Algorithms

Sparse principal component analysis addresses the problem of finding a linear combination of the variables in a given data set with a sparse coefficients vector that maximizes the variability of the data. This model enhances the ability to…

Optimization and Control · Mathematics 2017-03-09 Amir Beck, Yakov Vaisbourd