Related papers: Summarization and Classification of Non-Poisson Po…

For principled model fitting in mathematical biology

The mathematical models used to capture features of complex, biological systems are typically non-linear, meaning that there are no generally valid simple relationships between their outputs and the data that might be used to validate them.…

Quantitative Methods · Quantitative Biology 2014-04-23 Thomas House

Selecting fitted models under epistemic uncertainty using a stochastic process on quantile functions

Fitting models to data is an important part of the practice of science. Advances in machine learning have made it possible to fit more -- and more complex -- models, but have also exacerbated a problem: when multiple models fit the data…

Methodology · Statistics 2025-10-27 Alexandre René, André Longtin

A Non-Intrusive Correction Algorithm for Classification Problems with Corrupted Data

A novel correction algorithm is proposed for multi-class classification problems with corrupted training data. The algorithm is non-intrusive, in the sense that it post-processes a trained classification model by adding a correction…

Machine Learning · Computer Science 2020-02-13 Jun Hou, Tong Qin, Kailiang Wu, Dongbin Xiu

Changepoint Detection in Complex Models: Cross-Fitting Is Needed

Changepoint detection is commonly formulated by minimizing the sum of in-sample losses to quantify the model's overall fit. However, for flexible modeling procedures -- especially those involving high-dimensional parameter spaces or…

Methodology · Statistics 2026-05-05 Chengde Qian, Guanghui Wang, Zhaojun Wang, Changliang Zou

Fitting and Analysis Technique for Inconsistent Nuclear Data

Consistent experiment data are crucial to adjust parameters of physics models and to determine best estimates of observables. However, often experiment data are not consistent due to unrecognized systematic errors. Standard methods of…

Nuclear Theory · Physics 2018-03-05 Georg Schnabel

The Least Wrong Model Is Not in the Data

The true process that generated data cannot be determined when multiple explanations are possible. Prediction requires a model of the probability that a process, chosen randomly from the set of candidate explanations, generates some future…

Machine Learning · Computer Science 2014-04-18 Oscar Stiffelman

A Goodness-of-Fit Test for Statistical Models

Statistical modeling plays a fundamental role in understanding the underlying mechanism of massive data (statistical inference) and predicting the future (statistical prediction). Although all models are wrong, researchers try their best to…

Methodology · Statistics 2020-06-17 Hangjin Jiang

Is completeness necessary? Estimation in nonidentified linear models

Modern data analysis depends increasingly on estimating models via flexible high-dimensional or nonparametric machine learning methods, where the identification of structural parameters is often challenging and untestable. In linear…

Statistics Theory · Mathematics 2026-01-21 Andrii Babii, Jean-Pierre Florens

Certain and Approximately Certain Models for Statistical Learning

Real-world data is often incomplete and contains missing values. To train accurate models over real-world datasets, users need to spend a substantial amount of time and resources imputing and finding proper values for missing data items. In…

Machine Learning · Statistics 2024-03-05 Cheng Zhen, Nischal Aryal, Arash Termehchy, Alireza Aghasi, Amandeep Singh Chabada

An Introduction to the Calibration of Computer Models

In the context of computer models, calibration is the process of estimating unknown simulator parameters from observational data. Calibration is variously referred to as model fitting, parameter estimation/inference, an inverse problem, and…

Methodology · Statistics 2023-10-16 Richard D. Wilkinson, Christopher W. Lanyon

Statistical Inference for Disordered Sphere Packings

Sphere packings are essential to the development of physical models for powders, composite materials, and the atomic structure of the liquid state. There is a strong scientific need to be able to assess the fit of packing models to data,…

Methodology · Statistics 2009-10-31 Jeffrey Picka

Testing for Overfitting

High complexity models are notorious in machine learning for overfitting, a phenomenon in which models well represent data but fail to generalize an underlying data generating process. A typical procedure for circumventing overfitting…

Machine Learning · Statistics 2025-03-11 James Schmidt

Tailoring Machine Learning for Process Mining

Machine learning models are routinely integrated into process mining pipelines to carry out tasks like data transformation, noise reduction, anomaly detection, classification, and prediction. Often, the design of such models is based on…

Machine Learning · Computer Science 2024-02-21 Paolo Ceravolo, Sylvio Barbon Junior, Ernesto Damiani, Wil van der Aalst

Model-Based Multiple Instance Learning

While Multiple Instance (MI) data are point patterns -- sets or multi-sets of unordered points -- appropriate statistical point pattern models have not been used in MI learning. This article proposes a framework for model-based MI learning…

Machine Learning · Statistics 2017-08-15 Ba-Ngu Vo, Dinh Phung, Quang N. Tran, Ba-Tuong Vo

Nonparametric Bayes modeling of count processes

Data on count processes arise in a variety of applications, including longitudinal, spatial and imaging studies measuring count responses. The literature on statistical models for dependent count data is dominated by models built from…

Methodology · Statistics 2013-10-08 Antonio Canale, David B. Dunson

Calibration diagnostics for point process models via the probability integral transform

We propose the use of the probability integral transform (PIT) for model validation in point process models. The simple PIT diagnostics assess the calibration of the model and can detect inconsistencies in both the intensity and the…

Methodology · Statistics 2013-05-16 Thordis L. Thorarinsdottir

Statistical matching of non-Gaussian data

The statistical matching problem is a data integration problem with structured missing data. The general form involves the analysis of multiple datasets that only have a strict subset of variables jointly observed across all datasets. The…

Methodology · Statistics 2019-04-01 Daniel Ahfock, Saumyadipta Pyne, Geoffrey J. McLachlan

Classification and Bayesian Optimization for Likelihood-Free Inference

Some statistical models are specified via a data generating process for which the likelihood function cannot be computed in closed form. Standard likelihood-based inference is then not feasible but the model parameters can be inferred by…

Computation · Statistics 2015-02-20 Michael U. Gutmann, Jukka Corander, Ritabrata Dutta, Samuel Kaski

Feature Matching in Time Series Modeling

Using a time series model to mimic an observed time series has a long history. However, with regard to this objective, conventional estimation methods for discrete-time dynamical models are frequently found to be wanting. In fact, they are…

Statistics Theory · Mathematics 2015-03-19 Yingcun Xia, Howell Tong

Evaluating Nonlinear Decision Trees for Binary Classification Tasks with Other Existing Methods

Classification of datasets into two or more distinct classes is an important machine learning task. Many methods are able to classify binary classification tasks with a very high accuracy on test data, but cannot provide any easily…

Machine Learning · Computer Science 2020-08-26 Yashesh Dhebar, Sparsh Gupta, Kalyanmoy Deb