Related papers: FEDEX: An Explainability Framework for Data Explor…

Selecting Sub-tables for Data Exploration

We present a framework for creating small, informative sub-tables of large data tables to facilitate the first step of data science: data exploration. Given a large data table T, the goal is to create a sub-table of small, fixed…

Databases · Computer Science 2022-03-08 Kathy Razmadze, Yael Amsterdamer, Amit Somech, Susan B. Davidson, Tova Milo

Cube Interestingness: Novelty, Relevance, Peculiarity and Surprise

In this paper, we discuss methods to assess the interestingness of a query in an environment of data cubes. We assume a hierarchical multidimensional database, storing data cubes and level hierarchies. We start with a comprehensive review…

Databases · Computer Science 2023-07-27 Dimos Gkitsakis, Spyridon Kaloudis, Eirini Mouselli, Veronika Peralta, Patrick Marcel, Panos Vassiliadis

Interactive Data Exploration with Smart Drill-Down

We present smart drill-down, an operator for interactively exploring a relational table to discover and summarize "interesting" groups of tuples. Each group of tuples is described by a rule. For instance, the rule $(a, b, \star,…

Databases · Computer Science 2016-12-20 Manas Joglekar, Hector Garcia-Molina, Aditya Parameswaran

A Bayesian Network Model for Interesting Itemsets

Mining itemsets that are the most interesting under a statistical model of the underlying data is a commonly used and well-studied technique for exploratory data analysis, with the most recent interestingness models exhibiting state of the…

Machine Learning · Statistics 2016-11-14 Jaroslav Fowkes, Charles Sutton

Putting Things into Context: Rich Explanations for Query Answers using Join Graphs (extended version)

In many data analysis applications, there is a need to explain why a surprising or interesting result was produced by a query. Previous approaches to explaining results have directly or indirectly used data provenance (input tuples…

Databases · Computer Science 2021-03-30 Chenjie Li, Zhengjie Miao, Qitian Zeng, Boris Glavic, Sudeepa Roy

Towards Semantically Enhanced Data Understanding

In the field of machine learning, data understanding is the practice of getting initial insights in unknown datasets. Such knowledge-intensive tasks require a lot of documentation, which is necessary for data scientists to grasp the meaning…

Databases · Computer Science 2018-06-14 Markus Schröder, Christian Jilek, Jörn Hees, Andreas Dengel

Efficient Exploration of Interesting Aggregates in RDF Graphs

As large Open Data are increasingly shared as RDF graphs today, there is a growing demand to help users discover the most interesting facets of a graph, which are often hard to grasp without automatic tools. We consider the problem of…

Databases · Computer Science 2021-04-01 Yanlei Diao, Paweł Guzewicz, Ioana Manolescu, Mirjana Mazuran

Explainability Fact Sheets: A Framework for Systematic Assessment of Explainable Approaches

Explanations in Machine Learning come in many forms, but a consensus regarding their desired properties is yet to emerge. In this paper we introduce a taxonomy and a set of descriptors that can be used to characterise and systematically…

Machine Learning · Computer Science 2019-12-12 Kacper Sokol, Peter Flach

Interestingness First Classifiers

Most machine learning models are designed to maximize predictive accuracy. In this work, we explore a different goal: building classifiers that are interesting. An "interesting classifier" is one that uses unusual or unexpected features,…

Machine Learning · Computer Science 2025-08-28 Ryoma Sato

Scientific Dataset Discovery via Topic-level Recommendation

Data intensive research requires the support of appropriate datasets. However, it is often time-consuming to discover usable datasets matching a specific research topic. We formulate the dataset discovery problem on an attributed…

Information Retrieval · Computer Science 2021-06-08 Basmah Altaf, Shichao Pei, Xiangliang Zhang

Explaining Documents' Relevance to Search Queries

We present GenEx, a generative model to explain search results to users beyond just showing matches between query and document words. Adding GenEx explanations to search results greatly impacts user satisfaction and search performance.…

Information Retrieval · Computer Science 2021-11-03 Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan

Notable Characteristics Search through Knowledge Graphs

Query answering routinely employs knowledge graphs to assist the user in the search process. Given a knowledge graph that represents entities and relationships among them, one aims at complementing the search with intuitive but effective…

Databases · Computer Science 2018-02-13 Davide Mottin, Bastian Grasnick, Axel Kroschk, Patrick Siegler, Emmanuel Mueller

Untidy Data: The Unreasonable Effectiveness of Tables

Working with data in table form is usually considered a preparatory and tedious step in the sensemaking pipeline; a way of getting the data ready for more sophisticated visualization and analytical tools. But for many people, spreadsheets…

Human-Computer Interaction · Computer Science 2021-06-30 Lyn Bartram, Michael Correll, Melanie Tory

Explainable Product Search with a Dynamic Relation Embedding Model

Product search is one of the most popular methods for customers to discover products online. Most existing studies on product search focus on developing effective retrieval models that rank items by their likelihood to be purchased. They,…

Information Retrieval · Computer Science 2019-09-17 Qingyao Ai, Yongfeng Zhang, Keping Bi, W. Bruce Croft

Towards More Usable Dataset Search: From Query Characterization to Snippet Generation

Reusing published datasets on the Web is of great interest to researchers and developers. Their data needs may be met by submitting queries to a dataset search engine to retrieve relevant datasets. In this ongoing work towards developing a…

Information Retrieval · Computer Science 2019-08-30 Jinchi Chen, Xiaxia Wang, Gong Cheng, Evgeny Kharlamov, Yuzhong Qu

On Explaining Confounding Bias

When analyzing large datasets, analysts are often interested in the explanations for surprising or unexpected results produced by their queries. In this work, we focus on aggregate SQL queries that expose correlations in the data. A major…

Databases · Computer Science 2022-10-07 Brit Youngmann, Michael Cafarella, Yuval Moskovitch, Babak Salimi

Subjectively Interesting Subgroup Discovery on Real-valued Targets

Deriving insights from high-dimensional data is one of the core problems in data mining. The difficulty mainly stems from the fact that there are exponentially many variable combinations to potentially consider, and there are infinitely…

Machine Learning · Statistics 2021-11-08 Jefrey Lijffijt, Bo Kang, Wouter Duivesteijn, Kai Puolamäki, Emilia Oikarinen, Tijl De Bie

AIDE: An Automated Sample-based Approach for Interactive Data Exploration

In this paper, we argue that database systems be augmented with an automated data exploration service that methodically steers users through the data in a meaningful way. Such an automated system is crucial for deriving insights from…

Databases · Computer Science 2015-11-02 Kyriaki Dimitriadou, Olga Papaemmanouil, Yanlei Diao

Redescription Model Mining

This paper introduces Redescription Model Mining, a novel approach to identify interpretable patterns across two datasets that share only a subset of attributes and have no common instances. In particular, Redescription Model Mining aims to…

Databases · Computer Science 2021-07-12 Felix I. Stamm, Martin Becker, Markus Strohmaier, Florian Lemmerich

Improving Explanations: Applying the Feature Understandability Scale for Cost-Sensitive Feature Selection

With the growing pervasiveness of artificial intelligence, the ability to explain the inferences made by machine learning models has become increasingly important. Numerous techniques for model explainability have been proposed, with…

Human-Computer Interaction · Computer Science 2026-04-08 Nicola Rossberg, Bennett Kleinberg, Barry O'Sullivan, Luca Longo, Andrea Visentin