Related papers:
SCoRE Sets: A Versatile Framework for Simultaneous…
The analysis of excursion sets in imaging data is essential to a wide range of scientific disciplines such as neuroimaging, climatology and cosmology. Despite growing literature, there is little published concerning the comparison of…
In scientific disciplines such as neuroimaging, climatology, and cosmology it is useful to study the uncertainty of excursion sets of imaging data. While the case of imaging data obtained from a single study condition has already been…
Scientific discovery increasingly requires learning on federated datasets, fed by streams from high-resolution instruments, that have extreme class imbalance. Current ML approaches either require impractical data aggregation or fail due to…
In deploying artificial intelligence (AI) models, selective prediction offers the option to abstain from making a prediction when uncertain about model quality. To fulfill its promise, it is crucial to enforce strict and precise error…
We introduce Sequential Neural Posterior Score Estimation (SNPSE), a score-based method for Bayesian inference in simulator-based models. Our method, inspired by the remarkable success of score-based methods in generative modelling,…
Multi-modal generative document parsing systems challenge traditional evaluation: unlike deterministic OCR or layout models, they often produce semantically correct yet structurally divergent outputs. Conventional metrics such as CER, WER, IoU, or…
Hypothesis testing methods that do not rely on exact distribution assumptions have been emerging lately. The method of sign-perturbed sums (SPS) is capable of characterizing confidence regions with exact confidence levels for linear…
For a high-dimensional linear model with a finite number of covariates measured with error, we study statistical inference on the parameters associated with the error-prone covariates, and propose a new corrected decorrelated score test and…
Many scientific analyses require simultaneous comparison of multiple functionals of an unknown signal at once, calling for multidimensional confidence regions with guaranteed simultaneous frequentist coverage under structural constraints (e.g.,…
We consider the problem of uncertainty assessment for low dimensional components in high dimensional models. Specifically, we propose a decorrelated score function to handle the impact of high dimensional nuisance parameters. We consider…
Large Language Models (LLMs) can achieve inflated scores on multiple-choice tasks by exploiting inherent biases in option positions or labels, rather than demonstrating genuine understanding. This study introduces SCOPE, an evaluation…
We introduce a new estimator SMUCE (simultaneous multiscale change-point estimator) for the change-point problem in exponential family regression. An unknown step function is estimated by minimizing the number of change-points over the…
Standard multiple testing procedures are designed to report a list of discoveries, or suspected false null hypotheses, given the hypotheses' p-values or test scores. Recently there has been a growing interest in enhancing such procedures by…
We generalize standard credal set models for imprecise probabilities to include higher order credal sets -- confidences about confidences. In doing so, we specify how an agent's higher order confidences (credal sets) update upon observing…
We propose confidence regions for the parameters of incomplete models with exact coverage of the true parameter in finite samples. Our confidence region inverts a test, which generalizes Monte Carlo tests to incomplete models. The test…
Spatial confounding poses a significant challenge in scientific studies involving spatial data, where unobserved spatial variables can influence both treatment and outcome, possibly leading to spurious associations. To address this problem,…
Data with multiple functional recordings at each observational unit are increasingly common in various fields including medical imaging and environmental sciences. To conduct inference for such observations, we develop a paired two-sample…
We develop a uniform inference theory for high-dimensional slope parameters in threshold regression models, allowing for either cross-sectional or time series data. We first establish oracle inequalities for prediction errors, and L1…
We find the asymptotic distribution of the multi-dimensional multi-scale and kernel estimators for high-frequency financial data with microstructure. Sampling times are allowed to be asynchronous and endogenous. In the process, we show that…
We propose a new system identification method, called Sign-Perturbed Sums (SPS), for constructing non-asymptotic confidence regions under mild statistical assumptions. SPS is introduced for linear regression models, including but not…