Related papers: DataSlicer: Task-Based Data Selection for Visual D…
Real-world machine learning models require rigorous evaluation before deployment, especially in safety-critical domains like autonomous driving and surveillance. The evaluation of machine learning models often focuses on data slices, which…
The increasingly rapid growth of data production and the consequent need to explore data to obtain answers to the most varied questions have promoted the development of tools to facilitate the manipulation and construction of data…
Data discovery from data lakes is an essential application in modern data science. While many previous studies focused on improving the efficiency and effectiveness of data discovery, little attention has been paid to the usability of such…
Selecting relevant data subsets from large, unfamiliar datasets can be difficult. We address this challenge by modeling and visualizing two kinds of auxiliary information: (1) quality - the validity and appropriateness of data required to…
One of the major challenges for evaluating the effectiveness of data visualizations and visual analytics tools arises from the fact that different users may be using these tools for different tasks. In this paper, we present a simple…
Choosing a suitable visualization for data is a difficult task. Current data visualization recommender systems exist to aid in choosing a visualization, yet suffer from issues such as low accessibility and indecisiveness. In this study, we…
General visualization recommendation systems typically make design decisions for the dataset automatically. However, most of them can only prune meaningless visualizations but fail to recommend targeted results. This paper contributes…
Automated slicing aims to identify subsets of evaluation data where a trained model performs anomalously. This is an important problem for machine learning pipelines in production since it plays a key role in model debugging and comparison,…
Recent work in vision-and-language demonstrates that large-scale pretraining can learn generalizable models that are efficiently transferable to downstream tasks. While this may improve dataset-scale aggregate metrics, analyzing performance…
Dataset Search -- the process of finding appropriate datasets for a given task -- remains a critical yet under-explored challenge in data science workflows. Assessing dataset suitability for a task (e.g., training a classification model) is…
Visualizations of tabular data are widely used; understanding their effectiveness in different task and data contexts is fundamental to scaling their impact. However, little is known about how basic tabular data visualizations perform…
Appropriate evaluation is a key component in visualization research. It is typically based on empirical studies that assess visualization components or complete systems. While such studies often include the user of the visualization,…
Efficient explorative data analysis systems must take into account both what a user knows and wants to know. This paper proposes a principled framework for interactive visual exploration of relations in data, through views most informative…
Data slice finding is an emerging technique for validating machine learning (ML) models by identifying and analyzing subgroups in a dataset that exhibit poor performance, often characterized by distinct feature sets or descriptive metadata.…
One of the most useful techniques to help visual data analysis systems is interactive filtering (brushing). However, visualization techniques often suffer from overlapping graphical items and multi-attribute complexity, making visual…
Machine learning models make mistakes, yet sometimes it is difficult to identify the systematic problems behind the mistakes. Practitioners engage in various activities, including error analysis, testing, auditing, and red-teaming, to form…
Selecting the appropriate visual presentation of the data such that it preserves the semantics of the underlying data and at the same time provides an intuitive summary of the data is an important, and often the final, step of data analytics.…
Recent advances in visual analytics have enabled us to learn from user interactions and uncover analytic goals. These innovations set the foundation for actively guiding users during data exploration. Providing such guidance will become…
Effective data analysis ideally requires the analyst to have deep expertise as well as thorough knowledge of the data. Even with such familiarity, manually pursuing all potential hypotheses and exploring all possible views is impractical. We…
Program slicing has been widely applied in a variety of software engineering tasks. However, existing program slicing techniques only deal with traditional programs that are constructed with instructions and variables, rather than neural…