Related papers: Interactive Data Exploration with Smart Drill-Down

Selecting Sub-tables for Data Exploration

We present a framework for creating small, informative sub-tables of large data tables to facilitate the first step of data science: data exploration. Given a large data table T, the goal is to create a sub-table of small, fixed…

Databases · Computer Science 2022-03-08 Kathy Razmadze, Yael Amsterdamer, Amit Somech, Susan B. Davidson, Tova Milo

Interactive Set Discovery

We study the problem of set discovery where given a few example tuples of a desired set, we want to find the set in a collection of sets. A challenge is that the example tuples may not uniquely identify a set, and a large number of…

Databases · Computer Science 2022-10-05 Arif Hasnat, Davood Rafiei

FEDEX: An Explainability Framework for Data Exploration Steps

When exploring a new dataset, Data Scientists often apply analysis queries, look for insights in the resulting dataframe, and repeat to apply further queries. We propose in this paper a novel solution that assists data scientists in this…

Databases · Computer Science 2022-09-15 Daniel Deutch, Amir Gilad, Tova Milo, Amit Mualem, Amit Somech

Intelligent Drill-Down: Large Language Model-Driven Drill-Down Technique for Human-AI Collaborative Visual Exploration

In visual analytics, applying filters to drill-down and extract higher-value insights is a common and important data analysis method. When the drill-down space becomes excessively large, analysts may lose orientation, leading to decreased…

Human-Computer Interaction · Computer Science 2026-04-21 Zhijun Zheng, Tian Qiu, Yuheng Zhao, Siming Chen

Guided Visual Exploration of Relations in Data Sets

Efficient explorative data analysis systems must take into account both what a user knows and wants to know. This paper proposes a principled framework for interactive visual exploration of relations in data, through views most informative…

Machine Learning · Statistics 2021-07-02 Kai Puolamäki, Emilia Oikarinen, Andreas Henelius

Diverse Unionable Tuple Search: Novelty-Driven Discovery in Data Lakes [Technical Report]

Unionable table search techniques input a query table from a user and search for data lake tables that can contribute additional rows to the query table. The definition of unionability is generally based on similarity measures which may…

Databases · Computer Science 2025-09-03 Aamod Khatiwada, Roee Shraga, Renée J. Miller

Beyond Roll-Up's and Drill-Down's: An Intentional Analytics Model to Reinvent OLAP (long-version)

This paper structures a novel vision for OLAP by fundamentally redefining several of the pillars on which OLAP has been based for the last 20 years. We redefine OLAP queries, in order to move to higher degrees of abstraction from roll-up's…

Databases · Computer Science 2020-12-10 Panos Vassiliadis, Patrick Marcel, Stefano Rizzi

Mining Multi-Level Frequent Itemsets under Constraints

Mining association rules is a task of data mining, which extracts knowledge in the form of significant implication relation of useful items (objects) from a database. Mining multilevel association rules uses concept hierarchies, also called…

Databases · Computer Science 2010-12-30 Mohamed Salah Gouider, Amine Farhat

Selective association rule generation

Mining association rules is a popular and well researched method for discovering interesting relations between variables in large databases. A practical problem is that at medium to low support values often a large number of frequent…

Databases · Computer Science 2008-12-18 Michael Hahsler, Christian Buchta, Kurt Hornik

Interactive Summarization and Exploration of Top Aggregate Query Answers

We present a system for summarization and interactive exploration of high-valued aggregate query answers to make a large set of possible answers more informative to the user. Our system outputs a set of clusters on the high-valued query…

Databases · Computer Science 2018-08-01 Yuhao Wen, Xiaodan Zhu, Sudeepa Roy, Jun Yang

Subjectively Interesting Subgroup Discovery on Real-valued Targets

Deriving insights from high-dimensional data is one of the core problems in data mining. The difficulty mainly stems from the fact that there are exponentially many variable combinations to potentially consider, and there are infinitely…

Machine Learning · Statistics 2021-11-08 Jefrey Lijffijt, Bo Kang, Wouter Duivesteijn, Kai Puolamäki, Emilia Oikarinen, Tijl De Bie

A Lightweight Algorithm to Uncover Deep Relationships in Data Tables

Many data we collect today are in tabular form, with rows as records and columns as attributes associated with each record. Understanding the structural relationship in tabular data can greatly facilitate the data science process.…

Data Structures and Algorithms · Computer Science 2020-09-09 Jin Cao, Yibo Zhao, Linjun Zhang, Jason Li

Divisi: Interactive Search and Visualization for Scalable Exploratory Subgroup Analysis

Analyzing data subgroups is a common data science task to build intuition about a dataset and identify areas to improve model performance. However, subgroup analysis is prohibitively difficult in datasets with many features, and existing…

Human-Computer Interaction · Computer Science 2025-02-18 Venkatesh Sivaraman, Zexuan Li, Adam Perer

ExClus: Explainable Clustering on Low-dimensional Data Representations

Dimensionality reduction and clustering techniques are frequently used to analyze complex data sets, but their results are often not easy to interpret. We consider how to support users in interpreting apparent cluster structure on scatter…

Machine Learning · Computer Science 2021-11-08 Xander Vankwikelberge, Bo Kang, Edith Heiter, Jefrey Lijffijt

Learn to Explore: on Bootstrapping Interactive Data Exploration with Meta-learning

Interactive data exploration (IDE) is an effective way of comprehending big data, whose volume and complexity are beyond human abilities. The main goal of IDE is to discover user interest regions from a database through multi-rounds of user…

Databases · Computer Science 2023-01-03 Yukun Cao, Xike Xie, Kexin Huang

Interpretable Rule Discovery Through Bilevel Optimization of Split-Rules of Nonlinear Decision Trees for Classification Problems

For supervised classification problems involving design, control, and other practical purposes, users are not only interested in finding a highly accurate classifier, but they also demand that the obtained classifier be easily interpretable.…

Machine Learning · Computer Science 2020-08-04 Yashesh Dhebar, Kalyanmoy Deb

Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking

Large databases are often organized by hand-labeled metadata, or criteria, which are expensive to collect. We can use unsupervised learning to model database variation, but these models are often high dimensional, complex to parameterize,…

Computer Vision and Pattern Recognition · Computer Science 2017-06-14 James Tompkin, Kwang In Kim, Hanspeter Pfister, Christian Theobalt

Scalable Sampling for High Utility Patterns

Discovering valuable insights from data through meaningful associations is a crucial task. However, it becomes challenging when trying to identify representative patterns in quantitative databases, especially with large datasets, as…

Databases · Computer Science 2024-10-31 Lamine Diop, Marc Plantevit

SmartTable: A Spreadsheet Program with Intelligent Assistance

We introduce SmartTable, an online spreadsheet application that is equipped with intelligent assistance capabilities. With a focus on relational tables, describing entities along with their attributes, we offer assistance in two flavors:…

Information Retrieval · Computer Science 2018-05-17 Shuo Zhang, Vugar Abdul Zada, Krisztian Balog

Optimal Algorithms for Crawling a Hidden Database in the Web

A hidden database refers to a dataset that an organization makes accessible on the web by allowing users to issue queries through a search interface. In other words, data acquisition from such a source is not by following static…

Databases · Computer Science 2012-08-02 Cheng Sheng, Nan Zhang, Yufei Tao, Xin Jin