Related papers: Total Empiricism: Learning from Data
Statistical hypothesis testing is the central method to demarcate scientific theories in both exploratory and inferential analyses. However, whether this method befits such purpose remains a matter of debate. Established approaches to…
Testing hypotheses is an issue of primary importance in scientific research, as well as in many other human activities. Much clarification about it can be achieved if the process of learning from data is framed in a stochastic model of…
Statistics has moved beyond the frequentist-Bayesian controversies of the past. Where does this leave our ability to interpret results? I suggest that a philosophy compatible with statistical practice, labeled here statistical pragmatism,…
A growing body of literature attempts to learn about contagion using observational (i.e. non-experimental) data collected from a single social network. While the conclusions of these studies may be correct, the methods rely on assumptions…
Stochastic processes offer a flexible mathematical formalism to model and reason about systems. Most analysis tools, however, start from the premises that models are fully specified, so that any parameters controlling the system's dynamics…
Predictive inference is a fundamental task in statistics, traditionally addressed using parametric assumptions about the data distribution and detailed analyses of how models learn from data. In recent years, conformal prediction has…
Conventional statistics begins with a model, and assigns a likelihood of obtaining any particular set of data. The opposite approach, beginning with the data and assigning a likelihood to any particular model, is explored here for the case…
The study of associations and their causal explanations is a central research activity whose methodology varies tremendously across fields. Even within specialized subfields, comparisons across textbooks and journals reveal that the basics…
Statistical pragmatism embraces all efficient methods in statistical inference. Augmentation of the collected data is used herein to obtain representative population information from a large class of non-representative population's units.…
Traditional statistical inference considers relatively small data sets and the corresponding theoretical analysis focuses on the asymptotic behavior of a statistical estimator when the number of samples approaches infinity. However, many…
Comparisons of different treatments or production processes are the goals of a significant fraction of applied research. Unsurprisingly, two-sample problems play a main role in Statistics through natural questions such as `Is the new…
Statistical samples, in order to be representative, have to be drawn from a population in a random and unbiased way. Nevertheless, it is common practice in the field of model-based diagnosis to make estimations from (biased) best-first…
Attempts to replicate probabilistic reasoning in expert systems have typically overlooked a critical ingredient of that process. Probabilistic analysis typically requires extensive judgments regarding interdependencies among hypotheses and…
In this paper, we investigate the problem of assessing statistical methods and effectively summarizing results from simulations. Specifically, we consider problems of the type where multiple methods are compared on a reasonably large test…
This paper discusses the fundamental principles of causal inference - the area of statistics that estimates the effect of specific occurrences, treatments, interventions, and exposures on a given outcome from experimental and observational…
Several tasks in information retrieval (IR) rely on assumptions regarding the distribution of some property (such as term frequency) in the data being processed. This thesis argues that such distributional assumptions can lead to incorrect…
Inference is the process of using facts we know to learn about facts we do not know. A theory of inference gives assumptions necessary to get from the former to the latter, along with a definition for and summary of the resulting…
Randomized experiments have long been the gold standard for scientists seeking to learn about cause and effect. When randomized experiments are infeasible, scientists often resort to observational studies, which are widely available and…
Statistical learning relies upon data sampled from a distribution, and we usually do not care what actually generated it in the first place. From the point of view of causal modeling, the structure of each distribution is induced by…
The large majority of inferences drawn in empirical political research follow from model-based associations (e.g. regression). Here, we articulate the benefits of predictive modeling as a complement to this approach. Predictive models aim…