Mathematics
We address the problem of observation noise misspecification in Bayesian filtering of dynamical systems via recent advances in generalised Bayesian inference. Mismatch in tail decay between the true data generating process and an assumed…
Sampling is a fundamental algorithmic task in wide-ranging applications across multiple disciplines such as scientific computing, statistics and machine learning. In this paper, an efficient stochastic Runge-Kutta scheme is proposed to…
Protesting mildly against the notion of an exactly correct parametric model, the view is adopted that the logistic regression equation is merely an approximation to the underlying, true function. The behaviour of likelihood-based estimators…
We investigate a geometric and distributional reinterpretation of Chatterjee's $\xi$-coefficient, which measures functional dependence between a response variable $Y$ and a predictor vector $\mathbf{X}$. For this purpose, we analyze the…
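For orientation, the coefficient in its simplest form can be sketched as follows. This is a minimal illustration of the standard rank-based estimator for a scalar predictor with no ties, not the vector-predictor setting or the reinterpretation studied in the paper; the function name `xi_coefficient` is hypothetical.

```python
def xi_coefficient(x, y):
    """Chatterjee's xi for scalar x and y, assuming no ties in either sample."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length >= 2")
    if len(set(x)) != n or len(set(y)) != n:
        raise ValueError("this sketch assumes no ties")
    # Reorder the responses by increasing predictor value.
    order = sorted(range(n), key=lambda i: x[i])
    y_by_x = [y[i] for i in order]
    # Rank of each response among all responses (1 = smallest).
    rank_of = {v: r for r, v in enumerate(sorted(y), start=1)}
    ranks = [rank_of[v] for v in y_by_x]
    # Sum of absolute rank jumps between neighbours in the x-ordering.
    jumps = sum(abs(ranks[i + 1] - ranks[i]) for i in range(n - 1))
    return 1.0 - 3.0 * jumps / (n * n - 1)
```

For a strictly monotone relationship the rank jumps are all 1, so the estimate equals 1 - 3/(n + 1), which approaches 1 as n grows.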
We study the relation between the total variation (TV) and Hellinger distances between two Gaussian location mixtures. Our first result establishes a general upper bound: for any two mixing distributions supported on a compact set, the…
Discrete flow models offer a powerful framework for learning distributions over discrete state spaces and have demonstrated superior performance compared to discrete diffusion models. However, their convergence properties and error…
We introduce a new formulation of structural causal models for extremes, called the extremal structural causal model (eSCM). Unlike conventional structural causal models, where randomness is governed by a probability distribution, eSCMs use…
The Kolmogorov-Smirnov statistic is usually introduced as a supremum, but its finite-sample behavior is governed by a more local question: where does the empirical process first cross a boundary? This letter gives a partial answer through a…
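The supremum definition mentioned above can be made concrete with a short sketch. It computes the one-sample statistic against a continuous reference CDF by checking the gap on both sides of each jump of the empirical CDF; it does not implement the boundary-crossing analysis of the letter, and the names `ks_statistic` and `uniform_cdf` are hypothetical.

```python
def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic sup_x |F_n(x) - F(x)|.

    Assumes `cdf` is a continuous distribution function, so the supremum
    is attained just before or at one of the order statistics.
    """
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must be non-empty")
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # F_n jumps from (i - 1)/n to i/n at the i-th order statistic.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))
```

With the sample `[0.25, 0.75]` and the uniform reference, every one-sided gap equals 0.25, so the statistic is 0.25.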
The following learning problem arises naturally in various applications: Given a finite sample from a categorical or count time series, can we learn a function of the sample that (nearly) maximizes the probability of correctly guessing the…
We study the mean-squared error of $k$-fold cross-validation as a risk estimator, with particular emphasis on how its accuracy depends on the number of folds $k$. Despite the widespread use of cross-validation, principled guidance for…
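As a reminder of the estimator under study, here is a minimal sketch of $k$-fold cross-validation used as a risk estimate. The setup is an assumption chosen for brevity: the predictor is the training-sample mean, the loss is squared error, and folds are formed by interleaving indices. The function name `kfold_cv_risk` is hypothetical and nothing here reproduces the paper's analysis.

```python
def kfold_cv_risk(y, k):
    """k-fold cross-validation estimate of squared-error risk
    for the sample-mean predictor (illustrative setup only)."""
    n = len(y)
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= len(y)")
    # Interleaved folds: fold j holds indices j, j + k, j + 2k, ...
    folds = [list(range(j, n, k)) for j in range(k)]
    total = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [y[i] for i in range(n) if i not in held_out]
        prediction = sum(train) / len(train)  # fit on the other k - 1 folds
        total += sum((y[i] - prediction) ** 2 for i in fold)
    # Pooled average of held-out losses over all n points.
    return total / n
```

Setting `k = len(y)` gives leave-one-out cross-validation; smaller `k` trains on less data per fold, which is the trade-off the choice of `k` controls.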
The classical tail dependence coefficient (TDC) may fail to capture non-exchangeable features of bivariate tail dependence since it evaluates the underlying copula only along the diagonal. To address this limitation, several measures of…
We derive confidence intervals and confidence sequences for causal effects in situations where the back-door or front-door criteria are applicable. Our tightest confidence intervals hold in the standard setting where the training data…
The importance of functional data analysis has increased substantially in recent years. In machine learning, nonlinear function regression based on deep neural networks is referred to as operator learning, and many of its applications…
We propose a quasi-maximum likelihood estimation method for Bergomi-type stochastic volatility models with parametrized kernels, focusing on the estimation of the kernel parameters from high-frequency time-series observations of option…
This paper discusses digital online mathematics examinations -- a discussion ranging from high school to university level examinations. In particular, we consider the nature of mathematical writing, what is distinctive about mathematical…
We trace a conceptual genealogy from Abraham de Moivre's derivation of the normal curve (1733) to the modern distributional approach to statistics. De Moivre's Approximatio ad Summam Terminorum Binomii gave the first systematic derivation…
This paper introduces the Quantum Covariance Embedding, which embeds Positive Operator-Valued Measures into a tensor product of a Reproducing Kernel Hilbert Space and the quantum state space via a tensorized Bochner integral. This…
We study the sample complexity of robust binary hypothesis testing under three standard contamination models: $\varepsilon$-additive (Huber), $\varepsilon$-subtractive, and $\varepsilon$-total variation (TV), denoted by…
Identifying the most influential nodes in a network, typically using centrality measures, is a central task in applied network analysis. However, real-world networks are often constructed from noisy or incomplete data, which can distort…
We study the generalized dynamic factor model in a long-memory setting. Unlike most recent work, which assumes a finite-dimensional factor space and short memory, our framework allows the factor space to be infinite-dimensional and the…