Related papers: Discovering Multi-Table Functional Dependencies Wi…

Provenance-aware Discovery of Functional Dependencies on Integrated Views

The automatic discovery of functional dependencies (FDs) has been widely studied as one of the hardest problems in data profiling. Existing approaches have focused on making the FD computation efficient while inspecting single relations at a…

Databases · Computer Science 2021-12-17 Ugo Comignani, Laure Berti-Équille, Noël Novelli, Angela Bonifati

EAIFD: A Fast and Scalable Algorithm for Incremental Functional Dependency Discovery

Functional dependencies (FDs) are fundamental integrity constraints in relational databases, but discovering them under incremental updates remains challenging. While static algorithms are inefficient due to full re-execution, incremental…

Databases · Computer Science 2026-01-23 Yajuan Xu, Xixian Han, Xiaolong Wan

Redundancy-Driven Top-$k$ Functional Dependency Discovery

Functional dependencies (FDs) are basic constraints in relational databases and are used for many data management tasks. Most FD discovery algorithms find all valid dependencies, but this causes two problems. First, the computational cost…

Databases · Computer Science 2026-01-16 Xiaolong Wan, Xixian Han

Fast Discovery of Nested Dependencies on JSON Data

Functional and inclusion dependencies are the most widely used classes of data dependencies in data profiling due to their ability to identify relationships in data such as primary and foreign keys. These relationships are equally important…

Databases · Computer Science 2021-11-23 Michael J. Mior

Learning Functional Dependencies with Sparse Regression

We study the problem of discovering functional dependencies (FD) from a noisy dataset. We focus on FDs that correspond to statistical dependencies in a dataset and draw connections between FD discovery and structure learning in…

Databases · Computer Science 2019-05-07 Zhihan Guo, Theodoros Rekatsinas

Measuring and Predicting the Quality of a Join for Data Discovery

We study the problem of discovering joinable datasets at scale. We approach the problem from a learning perspective relying on profiles. These are succinct representations that capture the underlying characteristics of the schemata and data…

Databases · Computer Science 2023-06-01 Sergi Nadal, Raquel Panadero, Javier Flores, Oscar Romero

Scalable Data Discovery Using Profiles

We study the problem of discovering joinable datasets at scale. That is, how to automatically discover pairs of attributes in a massive collection of independent, heterogeneous datasets that can be joined. Exact (e.g., based on distinct…

Databases · Computer Science 2020-12-07 Javier Flores, Sergi Nadal, Oscar Romero

Data-Driven Discovery of PDEs via the Adjoint Method

In this work, we present an adjoint-based method for discovering the underlying governing partial differential equations (PDEs) given data. The idea is to consider a parameterized PDE in a general form and formulate a PDE-constrained…

Optimization and Control · Mathematics 2025-09-23 Mohsen Sadr, Tony Tohme, Kamal Youcef-Toumi

Efficient Join Processing Over Incomplete Data Streams (Technical Report)

For decades, the join operator over fast data streams has always drawn much attention from the database community, due to its wide spectrum of real-world applications, such as online clustering, intrusion detection, sensor data monitoring,…

Databases · Computer Science 2019-08-26 Weilong Ren, Xiang Lian, Kambiz Ghazinour

The Complexity of Dependency Detection and Discovery in Relational Databases

Multi-column dependencies in relational databases come associated with two different computational tasks. The detection problem is to decide whether a dependency of a certain type and size holds in a given database, the discovery problem…

Data Structures and Algorithms · Computer Science 2021-03-25 Thomas Bläsius, Tobias Friedrich, Martin Schirneck

Extending Databases to Support Data Manipulation with Functional Dependencies: a Vision Paper

In the current paper, we propose to fuse together stored data (tables) and their functional dependencies (FDs) inside a DBMS. We aim to make FDs first-class citizens: objects which can be queried and used to query data. Our idea is to allow…

Databases · Computer Science 2020-05-19 Nikita Bobrov, Kirill Smirnov, George Chernishev

Discovery of Approximate Differential Dependencies

Differential dependencies (DDs) capture the relationships between data columns of relations. They are more general than functional dependencies (FDs), and the difference is that DDs are defined on the distances between values of two…

Databases · Computer Science 2013-09-17 Jixue Liu, Selasi Kwashie, Jiuyong Li, Feiyue Ye, Millist Vincent

Discovery of Paradigm Dependencies

Missing and incorrect values often cause serious consequences. To deal with these data quality problems, a commonly employed class of tools is dependency rules, such as Functional Dependencies (FDs), Conditional Functional Dependencies…

Databases · Computer Science 2017-10-10 Jizhou Sun, Jianzhong Li, Hong Gao

Efficiently Estimating Mutual Information Between Attributes Across Tables

Relational data augmentation is a powerful technique for enhancing data analytics and improving machine learning models by incorporating columns from external datasets. However, it is challenging to efficiently discover relevant external…

Databases · Computer Science 2025-03-06 Aécio Santos, Flip Korn, Juliana Freire

Model Joins: Enabling Analytics Over Joins of Absent Big Tables

This work is motivated by two key facts. First, it is highly desirable to be able to learn and perform knowledge discovery and analytics (LKD) tasks without the need to access raw-data tables. This may be due to organizations finding it…

Databases · Computer Science 2022-06-22 Ali Mohammadi Shanghooshabad, Peter Triantafillou

Functional Dependencies Unleashed for Scalable Data Exchange

We address the problem of efficiently evaluating target functional dependencies (fds) in the Data Exchange (DE) process. Target fds naturally occur in many DE scenarios, including the ones in Life Sciences in which multiple source relations…

Databases · Computer Science 2016-04-19 Angela Bonifati, Ioana Ileana, Michele Linardi

QJoin: Transformation-aware Joinable Data Discovery Using Reinforcement Learning

Discovering which tables in large, heterogeneous repositories can be joined and by what transformations is a central challenge in data integration and data discovery. Traditional join discovery methods are largely designed for equi-joins,…

Databases · Computer Science 2025-12-03 Ning Wang, Sainyam Galhotra

Discovering Matching Dependencies

The concept of matching dependencies (mds) was recently proposed for specifying matching rules for object identification. Similar to the functional dependencies (with conditions), mds can also be applied to various data quality…

Databases · Computer Science 2009-06-13 Shaoxu Song, Lei Chen

Featurized-Decomposition Join: Low-Cost Semantic Joins with Guarantees

Large Language Models (LLMs) are being increasingly used within data systems to process large datasets with text fields. A broad class of such tasks involves a semantic join: joining two tables based on a natural language predicate per pair…

Databases · Computer Science 2025-12-08 Sepanta Zeighami, Shreya Shankar, Aditya Parameswaran

Functional Information Decomposition: A First-Principles Approach to Analyzing Functional Relationships

A central challenge in analyzing multivariate interactions within complex systems is to decompose how multiple inputs jointly determine an output. Existing approaches generally operate on observed probability distributions and can conflate…

Information Theory · Computer Science 2026-03-19 Clifford Bohm, Vincent R. Ragusa, Arend Hintze, Charles Ofria, Emily Dolson, Christoph Adami