Related papers: Inferactive data analysis
Data analyses typically rely upon assumptions about missingness mechanisms that lead to observed versus missing data. When the data are missing not at random, direct assumptions about the missingness mechanism, and indirect assumptions…
Describing the causal relations governing a system is a fundamental task in many scientific fields, ideally addressed by experimental studies. However, obtaining data under intervention scenarios may not always be feasible, while…
In 1977 John Tukey described how in exploratory data analysis, data analysts use tools, such as data visualizations, to separate their expectations from what they observe. In contrast to statistical theory, an underappreciated aspect of…
This paper offers a comprehensive introduction to Bayesian inference, combining historical context, theoretical foundations, and core analytical examples. Beginning with Bayes' theorem and the philosophical distinctions between Bayesian and…
In this article, we review selective inference, a set of techniques for inference when the statistical question asked is a function of the data. This setting often arises in contemporary scientific workflows, where hypotheses and parameters…
Large-scale datasets are increasingly being used to inform decision making. While this effort aims to ground policy in real-world evidence, challenges have arisen as selection bias and other forms of distribution shifts often plague…
Statistical inference as a formal scientific method to covert experience to knowledge has proven to be elusively difficult. While frequentist and Bayesian methodologies have been accepted in the contemporary era as two dominant schools of…
Statistical analysis is an important tool to distinguish systematic from chance findings. Current statistical analyses rely on distributional assumptions reflecting the structure of some underlying model, which if not met lead to problems…
Causal inference with observational data critically relies on untestable and extra-statistical assumptions that have (sometimes) testable implications. Well-known sets of assumptions that are sufficient to justify the causal interpretation…
In this paper, we propose a novel inference method for dynamic genetic networks which makes it possible to face with a number of time measurements n much smaller than the number of genes p. The approach is based on the concept of low order…
Data-carving methods perform selective inference by conditioning the distribution of data on the observed selection event. However, existing data-carving approaches typically require an analytically tractable characterization of the…
This chapter provides a overview of Bayesian inference, mostly emphasising that it is a universal method for summarising uncertainty and making estimates and predictions using probability statements conditional on observed data and an…
Generative AI has achieved remarkable empirical success, but from the perspective of statistics it often remains opaque: its predictions may be accurate, yet the underlying mechanism is difficult to interpret, analyze, and trust. This book…
Inference is the process of using facts we know to learn about facts we do not know. A theory of inference gives assumptions necessary to get from the former to the latter, along with a definition for and summary of the resulting…
Inspired by the concept of active learning, we propose active inference$\unicode{x2013}$a methodology for statistical inference with machine-learning-assisted data collection. Assuming a budget on the number of labels that can be collected,…
Causal discovery is crucial for understanding complex systems and informing decisions. While observational data can uncover causal relationships under certain assumptions, it often falls short, making active interventions necessary. Current…
It is not unusual for a data analyst to encounter data sets distributed across several computers. This can happen for reasons such as privacy concerns, efficiency of likelihood evaluations, or just the sheer size of the whole data set. This…
This dissertation focuses on modern causal inference under uncertainty and data restrictions, with applications to neoadjuvant clinical trials, distributed data networks, and robust individualized decision making. In the first project, we…
Inference tasks in signal processing are often characterized by the availability of reliable statistical modeling with some missing instance-specific parameters. One conventional approach uses data to estimate these missing parameters and…
We develop a mathematical and interpretative foundation for the enterprise of decision-theoretic statistical causality (DT), which is a straightforward way of representing and addressing causal questions. DT reframes causal inference as…