Related papers: Data Consistency Approach to Model Validation

200 papers

Statistical modeling plays a fundamental role in understanding the underlying mechanism of massive data (statistical inference) and predicting the future (statistical prediction). Although all models are wrong, researchers try their best to…

Methodology · Statistics 2020-06-17 Hangjin Jiang

How can we draw trustworthy scientific conclusions? One criterion is that a study can be replicated by independent teams. While replication is critically important, it is arguably insufficient. If a study is biased for some reason and other…

Methodology · Statistics 2025-02-06 Yujin Jeong, Dominik Rothenhäusler

The validation of a data-driven model is the process of assessing the model's ability to generalize to new, unseen data in the population of interest. This paper proposes a set of general rules for model validation. These rules are designed…

Methodology · Statistics 2026-01-30 José Camacho

The choice of model class is fundamental in statistical learning and system identification, no matter whether the class is derived from physical principles or is a generic black-box. We develop a method to evaluate the specified model class…

Machine Learning · Statistics 2017-12-20 Andreas Svensson, Dave Zachariah, Thomas B. Schön

We propose and study a general method for construction of consistent statistical tests on the basis of possibly indirect, corrupted, or partially available observations. The class of tests devised in the paper contains Neyman's smooth…

Statistics Theory · Mathematics 2017-09-22 Mikhail Langovoy

A subjective expected utility policy making centre, managing complex, dynamic systems, needs to draw on the expertise of a variety of disparate panels of experts and integrate this information coherently. To achieve this, diverse supporting…

Methodology · Statistics 2015-12-21 Jim Q. Smith, Martine J. Barons, Manuele Leonelli

Linear models are foundational tools in statistics and ubiquitous across the applied sciences. However, conventional statistical inference -- such as $t$-tests and $F$-tests -- are only valid at fixed sample sizes, making them unsuitable…

Methodology · Statistics 2025-07-08 Michael Lindon, Dae Woong Ham, Martin Tingley, Iavor Bojinov

Deep sequence models are receiving significant interest in current machine learning research. By representing probability distributions that are fit to data using maximum likelihood estimation, such models can model data on general…

Systems and Control · Electrical Eng. & Systems 2024-09-09 Kristian Løvland, Bjarne Grimstad, Lars Struen Imsland

In astronomy, there is an opportunity to enhance the practice of validating models through statistical techniques, specifically to account for measurement error uncertainties. While models are commonly used to describe observations, there…

Instrumentation and Methods for Astrophysics · Physics 2023-08-09 Fiorenzo Stoppa, Eric Cator, Gijs Nelemans

Prediction, where observed data is used to quantify uncertainty about a future observation, is a fundamental problem in statistics. Prediction sets with coverage probability guarantees are a common solution, but these do not provide…

Statistics Theory · Mathematics 2022-11-22 Leonardo Cella, Ryan Martin

A common approach to synthetic data is to sample from a fitted model. We show that under general assumptions, this approach results in a sample with inefficient estimators and whose joint distribution is inconsistent with the true…

Statistics Theory · Mathematics 2026-02-18 Jordan Awan, Zhanrui Cai

A great deal of effort has been devoted to reducing the risk of spurious scientific discoveries, from the use of sophisticated validation techniques, to deep statistical methods for controlling the false discovery rate in multiple…

Machine Learning · Computer Science 2016-03-03 Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, Omer Reingold, Aaron Roth

Vehicle models have a long history of research and as of today are able to model the involved physics in a reasonable manner. However, each new vehicle has its new characteristics or parameters. The identification of these is the main task…

Computational Engineering, Finance, and Science · Computer Science 2024-12-11 Nicola Henkelmann, Stephan Rhode, Johannes von Keler

Modern science increasingly relies on ever-growing observational datasets and automated inference pipelines, under the implicit belief that accumulating more data makes scientific conclusions more reliable. Here we show that this belief can…

Machine Learning · Computer Science 2026-02-06 Zhipeng Zhang, Kai Li

Many methods of estimating causal models do not provide estimates of confidence in the resulting model. In this work, a metric is proposed for validating the output of a causal model fit; the robustness of the model structure with resampled…

A fundamental question in causal inference is whether it is possible to reliably infer manipulation effects from observational data. There are a variety of senses of asymptotic reliability in the statistical literature, among which the most…

Artificial Intelligence · Computer Science 2012-12-12 Jiji Zhang, Peter L. Spirtes

Model selection and assessment with incomplete data pose challenges in addition to the ones encountered with complete data. There are two main reasons for this. First, many models describe characteristics of the complete data, in spite of…

Methodology · Statistics 2008-08-28 Geert Verbeke, Geert Molenberghs, Caroline Beunckens

A fundamental problem in the practice and teaching of data science is how to evaluate the quality of a given data analysis, which is different than the evaluation of the science or question underlying the data analysis. Previously, we…

Other Statistics · Statistics 2019-04-29 Stephanie C. Hicks, Roger D. Peng

When the data do not conform to the hypothesis of a known sampling-variance, the fitting of a constant to a set of measured values is a long debated problem. Given the data, fitting would require to find what measurand value is the most…

Data Analysis, Statistics and Probability · Physics 2020-07-21 Giovanni Mana, Enrico Massa, Maria Predescu

Much of scientific data is collected as randomized experiments intervening on some and observing other variables of interest. Quite often, a given phenomenon is investigated in several studies, and different sets of variables are involved…

Methodology · Statistics 2012-10-19 Antti Hyttinen, Frederick Eberhardt, Patrik O. Hoyer