Related papers: Understanding and Supporting Debugging Workflows i…
A multiverse analysis evaluates all combinations of "reasonable" analytic decisions to promote robustness and transparency, but can lead to a combinatorial explosion of analyses to compute. Long delays before assessing results prevent users…
Debugging non-deterministic programs on microcontrollers is notoriously challenging, especially when bugs manifest in unpredictable, input-dependent execution paths. A recent approach, called multiverse debugging, makes it easier to debug…
Debugging nondeterministic programs is inherently difficult, particularly in microcontroller environments where execution paths can diverge unpredictably due to external sensor inputs. Traditional debugging techniques often fail to capture…
As applications are developed, bugs are inevitably introduced. Often, it is unclear why a given code change introduced a given bug. To find this causal relation and debug more effectively, developers can leverage the existence of a previous…
To detect and fix bugs and security vulnerabilities, software companies use static analysis as part of the development process. However, static analysis code itself is also prone to bugs. To ensure a consistent level of precision, as…
When building statistical models for Bayesian data analysis tasks, required and optional iterative adjustments and different modelling choices can give rise to numerous candidate models. In particular, checks and evaluations throughout the…
While significant progress has been made in automating various aspects of software development through coding agents, there is still significant room for improvement in their bug fixing capabilities. Debugging and investigation of runtime…
Large language models (LLMs) have become central to modern AI workflows, powering applications from open-ended text generation to complex agent-based reasoning. However, debugging these models remains a persistent challenge due to their…
As software grows increasingly complex, the quantity and diversity of concerns to be addressed also rise. To address this diversity of concerns, developers may end up using multiple programming languages in a single software project, a…
Through case studies, we demonstrate how multiverse analysis can strengthen the robustness and transparency of computational social science findings against alternative methodological decisions. We conduct multiverse analyses of three…
Scientists often use meta-analysis to characterize the impact of an intervention on some outcome of interest across a body of literature. However, threats to the utility and validity of meta-analytic estimates arise when scientists average…
When language models answer open-ended problems, they implicitly make hidden decisions that shape their outputs, leaving users with uncontextualized answers rather than a working map of the problem; drawing on multiverse analysis from…
Fully autonomous teams of LLM-powered AI agents are emerging that collaborate to perform complex tasks for users. What challenges do developers face when trying to build and debug these AI agent teams? In formative interviews with five AI…
Amid mounting concern about the reliability and credibility of machine learning research, we present a principled framework for making robust and generalizable claims: the multiverse analysis. Our framework builds upon the multiverse…
Although bibliometrics has become an essential tool in the evaluation of research performance, bibliometric analyses are sensitive to a range of methodological choices. Subtle choices in data selection, indicator construction, and modeling…
Multiverse analysis is an approach to data analysis in which all "reasonable" analytic decisions are evaluated in parallel and interpreted collectively, in order to foster robustness and transparency. However, specifying a multiverse is…
Determining whether a configurable software system has a performance bug or was misconfigured is often challenging. While there are numerous debugging techniques that can support developers in this task, there is limited empirical…
Machine Learning (ML) is more than just training models; the whole workflow must be considered. Once deployed, an ML model needs to be constantly monitored, supervised, and debugged to guarantee its validity and robustness in unexpected…
Design of large software systems requires rigorous application of software engineering methods covering all phases of the software process. Debugging during the early design phases is extremely important, because late bug-fixes are…
As multi-agent systems powered by Large Language Models (LLMs) are increasingly adopted in real-world workflows, users with diverse technical backgrounds are now building and refining their own agentic processes. However, these systems can…