Related papers: Code Code Evolution: Understanding How People Chan…
Artificial intelligence offers powerful new tools for scientific discovery, but the interaction paradigms required to effectively harness these systems remain underexplored. In this paper, we present findings from a formative user study…
Reusing and making sense of other scientists' computational notebooks. However, making sense of existing notebooks is a struggle, as these reference notebooks are often exploratory, have messy structures, include multiple alternatives, and…
It is important for researchers to understand precisely how data scientists turn raw data into insights, including typical programming patterns, workflow, and methodology. This paper contributes a novel system, called DataInquirer, that…
The sharing and reuse of data are seen as critical to solving the most complex problems of today. Despite this potential, relatively little is known about a key step in data reuse: people's behaviours involved in data-centric sensemaking.…
In software engineering, numerous studies have focused on the analysis of fine-grained logs, leading to significant innovations in areas such as refactoring, security, and code completion. However, no similar studies have been conducted for…
Aiming to help people conduct online research tasks, much research has gone into tools for searching for, collecting, organizing, and synthesizing online information. However, outside of the lab, in-the-wild sensemaking sessions (with data…
There is a gap between how people explore data and how Jupyter-like computational notebooks are designed. People explore data nonlinearly, using execution undos, branching, and/or complete reverts, whereas notebooks are designed for…
Computational notebooks are the primary coding tools for data scientists, but their code quality remains understudied and often poor. Given the importance of maintainability and reusability, enhancing code understandability is essential.…
Sensemaking is the cognitive process of extracting information, creating schemata from knowledge, making decisions from those schemata, and inferring conclusions. Human analysts are essential to exploring and quantifying this process, but…
Saving, or checkpointing, intermediate results during interactive data exploration can potentially boost user productivity. However, existing studies on this topic are limited, as they primarily rely on small-scale experiments with human…
Data wrangling, the process of preparing raw data for further analysis in computational notebooks, is a crucial yet time-consuming step in data science. Code generation has the potential to automate the data wrangling process to reduce…
Notebooks provide an interactive environment for programmers to develop code, analyse data and inject interleaved visualizations in a single environment. Despite their flexibility, a major pitfall that data scientists encounter is…
Exploratory Data Analysis (EDA) is a routine task for data analysts, often conducted using flexible computational notebooks. During EDA, data workers process, visualize, and interpret data tables, making decisions about subsequent analysis.…
The massive trend of integrating data-driven AI capabilities into traditional software systems is rising new intriguing challenges. One of such challenges is achieving a smooth transition from the explorative phase of Machine Learning…
With the goal of identifying common practices in data science projects, this paper proposes a framework for logging and understanding incremental code executions in Jupyter notebooks. This framework aims to allow reasoning about how…
Computational notebooks, which integrate code, documentation, tags, and visualizations into a single document, have become increasingly popular for data analysis tasks. With the advent of immersive technologies, these notebooks have evolved…
Code search is an important and frequent activity for developers using computational notebooks (e.g., Jupyter). The flexibility of notebooks brings challenges for effective code search, where classic search interfaces for traditional…
We offer a new model of the sensemaking process for data analysis and visualization. Whereas past sensemaking models have been grounded in positivist assumptions about the nature of knowledge, we reframe data sensemaking in critical,…
Sensemaking and narrative are two inherently interconnected concepts about how people understand the world around them. Sensemaking is the process by which people structure and interconnect the information they encounter in the world with…
Measuring code understandability is both highly relevant and exceptionally challenging. This paper proposes a dynamic code understandability assessment method, which estimates a personalized code understandability score from the perspective…