Related papers: Data set diagonalization in a global fit
There are many uses for linear fitting; the context here is interpolation and denoising of data, as when you have calibration data and you want to fit a smooth, flexible function to those data. Or you want to fit a flexible function to…
I discuss recent developments in the determination of parton distributions from global fits. I concentrate on the errors associated with these parton distributions and with the physical quantities which are determined in terms of them. I…
We address the goal of conducting inference about a smooth finite-dimensional parameter by utilizing individual-level data from various independent sources. Recent advancements have led to the development of a comprehensive theory capable…
The performance of machine learning models relies heavily on the quality of input data, yet real-world applications often face significant data-related challenges. A common issue arises when curating training data or deploying models: two…
The recently developed "Data Set Diagonalization" method (DSD) is applied to measure compatibility of the data sets that are used to determine parton distribution functions (PDFs). Discrepancies among the experiments are found to be…
We propose a new and rather stringent criterion for testing the goodness of fit between a theory and experiment. It is motivated by the paradox that the criterion on \chi^2 for testing a theory is much weaker than the criterion for finding…
A discussion is presented of the manner in which uncertainties in parton distributions and related quantities are determined. One of the central problems is the criteria used to judge what variation of the parameters describing a set of…
Statistical modeling plays a fundamental role in understanding the underlying mechanism of massive data (statistical inference) and predicting the future (statistical prediction). Although all models are wrong, researchers try their best to…
We discuss a goodness-of-fit method which tests the compatibility between statistically independent data sets. The method gives sensible results even in cases where the chi^2-minima of the individual data sets are very low or when several…
The advent of modern technology, permitting the measurement of thousands of characteristics simultaneously, has given rise to floods of data characterized by many large or even huge datasets. This new paradigm presents extraordinary…
Widespread use of the Internet and social networks invokes the generation of big data, which is proving to be useful in a number of applications. To deal with explosively growing amounts of data, data analytics has emerged as a critical…
Consistent experiment data are crucial to adjust parameters of physics models and to determine best estimates of observables. However, often experiment data are not consistent due to unrecognized systematic errors. Standard methods of…
We introduce the neural network approach to global fits of parton distribution functions. First we review previous work on unbiased parametrizations of deep-inelastic structure functions with faithful estimation of their uncertainties, and…
One of the major open problems in machine learning is to characterize generalization in the overparameterized regime, where most traditional generalization bounds become inconsistent even for overparameterized linear regression. In many…
In many astronomical problems one often needs to determine the upper and/or lower boundary of a given data set. An automatic and objective approach consists in fitting the data using a generalised least-squares method, where the function to…
When data do not conform to the hypothesis of a known sampling-variance, the fitting of a constant to the set of measured values is a long debated problem. Given the data, the fitting would require to find which measurand value is most…
When fitting theory to data in the presence of background uncertainties, the question of whether the spectral shape of the background happens to be similar to that of the theoretical model of physical interest has not generally been…
A goodness-of-fit test for the fitting of a parametric model to data obtained from a detector with finite resolution and limited acceptance is proposed. The parameters of the model are found by minimization of a statistic that is used for…
Domain generalization is the problem of machine learning when the training data and the test data come from different data domains. We present a simple theoretical model of learning to generalize across domains in which there is a…
We provide an analytical argument for understanding the likely nature of parameter shifts between those coming from an analysis of a dataset and from a subset of that dataset, assuming differences are down to noise and any intrinsic…