Related papers: Understanding parameter differences between analys…
The performance of machine learning models relies heavily on the quality of input data, yet real-world applications often face significant data-related challenges. A common issue arises when curating training data or deploying models: two…
In a statistical analysis in particle physics, nuisance parameters can be introduced to account for various types of systematic uncertainties. The best estimate of such a parameter is often modeled as a Gaussian-distributed variable…
Sensitivity analysis informs causal inference by assessing the sensitivity of conclusions to departures from assumptions. The consistency assumption states that there are no hidden versions of treatment and that the outcome arising…
The paper presents a construction of a quantitative measure of variability for parameter estimates in the data fitting problem under interval uncertainty. It shows the degree of variability and ambiguity of the estimate, and the need for…
The data revolution has led to an increased interest in the practice of data analysis. For a given problem, there can be significant or subtle differences in how a data analyst constructs or creates a data analysis, including differences in…
The sensitivities revealed by a sensitivity analysis of a probabilistic network typically depend on the entered evidence. For a real-life network, therefore, the analysis is performed a number of times, with different evidence. Although…
The paper is concerned with inference for a parameter of interest in models that share a common interpretation for that parameter but that may differ appreciably in other respects. We study the general structure of models under which the…
High-throughput data analyses are becoming common in biology, communications, economics and sociology. The vast amounts of data are usually represented in the form of matrices and can be considered as knowledge networks. Spectra-based…
Measurement system analysis aims to quantify the variability in data attributable to the measurement system and evaluate its contribution to overall data variability. This paper conducts a rigorous theoretical investigation of the…
The study of associations and their causal explanations is a central research activity whose methodology varies tremendously across fields. Even within specialized subfields, comparisons across textbooks and journals reveal that the basics…
When the data do not conform to the hypothesis of a known sampling variance, fitting a constant to a set of measured values is a long-debated problem. Given the data, fitting would require finding which measurand value is the most…
We consider high-dimensional estimation problems where the number of parameters diverges with the sample size. General conditions are established for consistency, uniqueness, and asymptotic normality in both unpenalized and penalized…
Understanding causal relationships among the variables of a system is paramount to explain and control its behavior. For many real-world systems, however, the true causal graph is not readily available and one must resort to predictions…
Sampling errors in nested sampling parameter estimation differ from those in Bayesian evidence calculation, but have been little studied in the literature. This paper provides the first explanation of the two main sources of sampling errors…
The problem of detecting and quantifying the presence of symmetries in datasets is useful for model selection, generative modeling, and data analysis, amongst others. While existing methods for hard-coding transformations in neural networks…
Many existing approaches for estimating parameters in settings with distributional shifts operate under an invariance assumption. For example, under covariate shift, it is assumed that $p(y|x)$ remains invariant. We refer to such…
Evaluating a neural network on an input that differs markedly from the training data might cause erratic and flawed predictions. We study a method that judges the unusualness of an input by evaluating its informative content compared to the…
Interpreting data with mathematical models is an important aspect of real-world industrial and applied mathematical modeling. Often we are interested in understanding the extent to which a particular set of data informs and constrains model…
For data segmentation in high-dimensional linear regression settings, the regression parameters are often assumed to be sparse segment-wise, which enables many existing methods to estimate the parameters locally via $\ell_1$-regularised…
Current practices in metric evaluation focus on a single dataset, e.g., the Newstest dataset in each year's WMT Metrics Shared Task. However, in this paper, we qualitatively and quantitatively show that the performances of metrics are…