Related papers: PANDA: Query Evaluation in Submodular Width

What do Shannon-type Inequalities, Submodular Width, and Disjunctive Datalog have to do with one another?

Recent works on bounding the output size of a conjunctive query with functional dependencies and degree constraints have shown a deep connection between fundamental questions in information theory and database theory. We prove analogous…

Databases · Computer Science 2023-12-27 Mahmoud Abo Khamis, Hung Q. Ngo, Dan Suciu

Query Optimization and Evaluation via Information Theory: A Tutorial

Database theory is exciting because it studies highly general and practically useful abstractions. Conjunctive query (CQ) evaluation is a prime example: it simultaneously generalizes graph pattern matching, constraint satisfaction, and…

Databases · Computer Science 2026-04-07 Mahmoud Abo Khamis, Hung Q. Ngo, Dan Suciu

Jaguar: A Primal Algorithm for Conjunctive Query Evaluation in Submodular-Width Time

The submodular width is a complexity measure of conjunctive queries (CQs), which assigns a nonnegative real number, subw(Q), to each CQ Q. An existing algorithm, called PANDA, performs CQ evaluation in polynomial time where the exponent is…

Databases · Computer Science 2026-04-08 Mahmoud Abo Khamis, Hubie Chen

PANDAExpress: a Simpler and Faster PANDA Algorithm

PANDA is a powerful generic algorithm for answering conjunctive queries (CQs) and disjunctive datalog rules (DDRs) given input degree constraints. In the special case where degree constraints are cardinality constraints and the query is…

Databases · Computer Science 2026-04-08 Mahmoud Abo Khamis, Hung Q. Ngo, Dan Suciu

Join Size Bounds using Lp-Norms on Degree Sequences

Estimating the output size of a query is a fundamental yet longstanding problem in database query processing. Traditional cardinality estimators used by database systems can routinely underestimate the true output size by orders of…

Databases · Computer Science 2024-06-07 Mahmoud Abo Khamis, Vasileios Nakos, Dan Olteanu, Dan Suciu

Efficient Algorithms for Cardinality Estimation and Conjunctive Query Evaluation With Simple Degree Constraints

Cardinality estimation and conjunctive query evaluation are two of the most fundamental problems in database query processing. Recent work proposed, studied, and implemented a robust and practical information-theoretic cardinality…

Databases · Computer Science 2025-04-04 Sungjin Im, Benjamin Moseley, Hung Q. Ngo, Kirk Pruhs

Information Theory Strikes Back: New Development in the Theory of Cardinality Estimation

Estimating the cardinality of the output of a query is a fundamental problem in database query processing. In this article, we overview a recently published contribution that casts the cardinality estimation problem as linear optimization…

Databases · Computer Science 2025-05-13 Mahmoud Abo Khamis, Vasileios Nakos, Dan Olteanu, Dan Suciu

High Performance Computing of Gene Regulatory Networks using a Message-Passing Model

Gene regulatory network reconstruction is a fundamental problem in computational biology. We recently developed an algorithm, called PANDA (Passing Attributes Between Networks for Data Assimilation), that integrates multiple sources of…

Quantitative Methods · Quantitative Biology 2017-04-18 Kimberly Glass, John Quackenbush, Jeremy Kepner

Contention Resolution with Predictions

In this paper, we consider contention resolution algorithms that are augmented with predictions about the network. We begin by studying the natural setup in which the algorithm is provided a distribution defined over the possible network…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-27 Seth Gilbert, Calvin Newport, Nitin Vaidya, Alex Weaver

PANDA: Predicting the change in proteins binding affinity upon mutations using sequence information

Accurately determining a change in protein binding affinity upon mutations is important for the discovery and design of novel therapeutics and to assist mutagenesis studies. Determination of change in binding affinity upon mutations…

Biomolecules · Quantitative Biology 2021-09-01 Wajid Arshad Abbasi, Syed Ali Abbas, Saiqa Andleeb

Quantum Information-Theoretical Size Bounds for Conjunctive Queries with Functional Dependencies

Deriving formulations for computing and estimating tight worst-case size increases for conjunctive queries with various constraints has been at the core of theoretical database research. If the problem has no constraints or only one…

Quantum Physics · Physics 2025-06-10 Valter Uotila, Jiaheng Lu

Information inequality problem over set functions

Information inequalities appear in many database applications such as query output size bounds, query containment, and implication between data dependencies. Recently Khamis et al. proposed to study the algorithmic aspects of information…

Databases · Computer Science 2023-09-22 Miika Hannula

Pivotal Estimation of Linear Discriminant Analysis in High Dimensions

We consider the linear discriminant analysis problem in high-dimensional settings. In this work, we propose PANDA (PivotAl liNear Discriminant Analysis), a tuning-insensitive method in the sense that it requires very little effort to…

Statistics Theory · Mathematics 2023-09-19 Ethan X. Fang, Yajun Mei, Yuyang Shi, Qunzhi Xu, Tuo Zhao

Lower Bounds for the Algorithmic Complexity of Learned Indexes

Learned index structures aim to accelerate queries by training machine learning models to approximate the rank function associated with a database attribute. While effective in practice, their theoretical limitations are not fully…

Data Structures and Algorithms · Computer Science 2026-01-13 Luis Alberto Croquevielle, Roman Sokolovskii, Thomas Heinis

Blend: A Unified Data Discovery System

Most research on data discovery has so far focused on improving individual discovery operators such as join, correlation, or union discovery. However, in practice, a combination of these techniques and their corresponding indexes may be…

Databases · Computer Science 2024-12-02 Mahdi Esmailoghli, Christoph Schnell, Renée J. Miller, Ziawasch Abedjan

Support Size Estimation: The Power of Conditioning

We consider the problem of estimating the support size of a distribution $D$. Our investigations are pursued through the lens of distribution testing and seek to understand the power of conditional sampling (denoted as COND), wherein one is…

Data Structures and Algorithms · Computer Science 2022-11-23 Diptarka Chakraborty, Gunjan Kumar, Kuldeep S. Meel

Information Inequalities for Joint Distributions, with Interpretations and Applications

Upper and lower bounds are obtained for the joint entropy of a collection of random variables in terms of an arbitrary collection of subset joint entropies. These inequalities generalize Shannon's chain rule for entropy as well as…

Information Theory · Computer Science 2024-05-07 Mokshay Madiman, Prasad Tetali

Degree Sequence Bound For Join Cardinality Estimation

Recent work has demonstrated the catastrophic effects of poor cardinality estimates on query processing time. In particular, underestimating query cardinality can result in overly optimistic query plans which take orders of magnitude longer…

Databases · Computer Science 2022-03-31 Kyle Deeds, Dan Suciu, Magda Balazinska, Walter Cai

Guaranteeing the $\tilde{O}$(AGM/OUT) Runtime for Uniform Sampling and OUT Size Estimation over Joins

We propose a new method for estimating the number of answers OUT of a small join query Q in a large database D, and for uniform sampling over joins. Our method is the first to satisfy all the following statements. - Support arbitrary Q,…

Databases · Computer Science 2023-04-11 Kyoungmin Kim, Jaehyun Ha, George Fletcher, Wook-Shin Han

Bounds in Query Learning

We introduce new combinatorial quantities for concept classes, and prove lower and upper bounds for learning complexity in several models of query learning in terms of various combinatorial quantities. Our approach is flexible and powerful…

Machine Learning · Computer Science 2019-04-24 Hunter Chase, James Freitag