Statistics Theory - Scifaro

Dynamic Structural Causal Models

We study a specific type of SCM, called a Dynamic Structural Causal Model (DSCM), whose endogenous variables represent functions of time; the model may be cyclic and allows for latent confounding. As a motivating use-case, we show that…

Statistics Theory · Mathematics 2024-07-23 Philip Boeken, Joris M. Mooij

Eigenvector overlaps in large sample covariance matrices and nonlinear shrinkage estimators

Consider a data matrix $Y = [\mathbf{y}_1, \cdots, \mathbf{y}_N]$ of size $M \times N$, where the columns are independent observations from a random vector $\mathbf{y}$ with zero mean and population covariance $\Sigma$. Let $\mathbf{u}_i$…

Statistics Theory · Mathematics 2024-07-23 Zeqin Lin, Guangming Pan

On the Asymptotic Normality of Trimmed and Winsorized L-statistics

There are several ways to establish the asymptotic normality of $L$-statistics, depending on the choice of the weights-generating function and of the cumulative distribution function of the underlying model. In this study, we focus on…

Statistics Theory · Mathematics 2024-07-23 Chudamani Poudyal

Robust Signal Recovery in Hadamard Spaces

We analyze the stability of (strong) laws of large numbers in Hadamard spaces with respect to distributional perturbations. For the inductive means of a sequence of independent, but not necessarily identically distributed random variables,…

Statistics Theory · Mathematics 2024-07-23 Georg Köstenberger, Thomas Stark

Off-the-grid prediction and testing for linear combination of translated features

We consider a model where a signal (discrete or continuous) is observed with an additive Gaussian noise process. The signal arises from a linear combination of a finite but increasing number of translated features. The features are…

Statistics Theory · Mathematics 2024-07-23 Cristina Butucea, Jean-François Delmas, Anne Dutfoy, Clément Hardy

Global optimality under amenable symmetry constraints

Consider a convex function that is invariant under a group of transformations. If it has a minimizer, does it also have an invariant minimizer? Variants of this problem appear in nonparametric statistics and in a number of adjacent fields.…

Statistics Theory · Mathematics 2024-07-22 Peter Orbanz

Universality laws for Gaussian mixtures in generalized linear models

Let $(x_{i}, y_{i})_{i=1,\dots,n}$ denote independent samples from a general mixture distribution $\sum_{c\in\mathcal{C}}\rho_{c}P_{c}^{x}$, and consider the hypothesis class of generalized linear models $\hat{y} = F(\Theta^{\top}x)$. In…

Statistics Theory · Mathematics 2024-07-22 Yatin Dandi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro, Lenka Zdeborová

Time series on compact spaces, with an application to dynamic modeling of relative abundance data in Ecology

Motivated by the dynamic modeling of relative abundance data in ecology, we introduce a general approach to model stationary Markovian or non-Markovian time series on (relatively) compact spaces such as a hypercube, the simplex or a sphere…

Statistics Theory · Mathematics 2024-07-22 Guillaume Franchi, Lionel Truquet

Information-theoretic convergence of extreme values to the Gumbel distribution

We show how convergence to the Gumbel distribution in an extreme value setting can be understood in an information-theoretic sense. We introduce a new type of score function which behaves well under the maximum operation, and which implies…

Statistics Theory · Mathematics 2024-07-22 Oliver Johnson

Bi-stochastically normalized graph Laplacian: convergence to manifold Laplacian and robustness to outlier noise

Bi-stochastic normalization provides an alternative normalization of graph Laplacians in graph-based data analysis and can be computed efficiently by Sinkhorn-Knopp (SK) iterations. This paper proves the convergence of bi-stochastically…

Statistics Theory · Mathematics 2024-07-19 Xiuyuan Cheng, Boris Landa

Parametric and nonparametric probability distribution estimators of sample maximum

Extreme value theory has established asymptotic properties of the sample maximum. This study concerns probability distribution estimation of the sample maximum. The traditional approach is parametric fitting to the limiting distribution --…

Statistics Theory · Mathematics 2024-07-19 Taku Moriyama

A new method of joint nonparametric estimation of probability density and its support

In this paper we propose a new method of joint nonparametric estimation of a probability density and its support. As is well known, the nonparametric kernel density estimator has a "boundary bias problem" when the support of the population density…

Statistics Theory · Mathematics 2024-07-19 Taku Moriyama

Smoothed nonparametric two-sample tests

We propose new smoothed versions of the median test and Wilcoxon's rank sum test. As pointed out by Maesono et al. (2016), some nonparametric discrete tests have a problem with their significance probability. Because of this problem, the selection of the…

Statistics Theory · Mathematics 2024-07-19 Taku Moriyama, Yoshihiko Maesono

A new kernel estimator of hazard ratio and its asymptotic mean squared error

The hazard function is the ratio of a density to a survival function, and it is a basic tool of survival analysis. In this paper we propose a kernel estimator of the hazard ratio function, which is based on a modification of Ćwik and…

Statistics Theory · Mathematics 2024-07-19 Taku Moriyama, Yoshihiko Maesono

Smoothed nonparametric tests and their properties

In this paper we propose new smoothed sign and Wilcoxon's signed rank tests, which are based on a kernel estimator of the underlying distribution function of data. We discuss approximations of $p$-values and asymptotic properties of these…

Statistics Theory · Mathematics 2024-07-19 Yoshihiko Maesono, Taku Moriyama, Mengxin Lu

On filter-type estimation of discretely sampled cyclic long-memory processes

The generalized filtered method of moments was developed in the recent papers by Alomari et al. (2020) and Ayache et al. (2022). It used functional data obtained from continuously sampled cyclic long-memory stochastic processes to…

Statistics Theory · Mathematics 2024-07-18 Antoine Ayache, Serhii Kravchenko, Andriy Olenko

Determine the Number of States in Hidden Markov Models via Marginal Likelihood

Hidden Markov models (HMMs) have been widely used by scientists to model stochastic systems: the underlying process is a discrete Markov chain and the observations are noisy realizations of the underlying process. Determining the number of…

Statistics Theory · Mathematics 2024-07-18 Yang Chen, Cheng-Der Fuh, Chu-Lan Michael Kao

Universal Lower Bounds and Optimal Rates: Achieving Minimax Clustering Error in Sub-Exponential Mixture Models

Clustering is a pivotal challenge in unsupervised machine learning and is often investigated through the lens of mixture models. The optimal error rate for recovering cluster labels in Gaussian and sub-Gaussian mixture models involves ad…

Statistics Theory · Mathematics 2024-07-18 Maximilien Dreveton, Alperen Gözeten, Matthias Grossglauser, Patrick Thiran

An order-theoretic perspective on modes and maximum a posteriori estimation in Bayesian inverse problems

It is often desirable to summarise a probability measure on a space $X$ in terms of a mode, or MAP estimator, i.e. a point of maximum probability. Such points can be rigorously defined using masses of metric balls in the small-radius…

Statistics Theory · Mathematics 2024-07-18 Hefin Lambley, T. J. Sullivan

Estimating a density near an unknown manifold: a Bayesian nonparametric approach

We study the Bayesian density estimation of data living in the offset of an unknown submanifold of the Euclidean space. In this perspective, we introduce a new notion of anisotropic Hölder for the underlying density and obtain posterior…

Statistics Theory · Mathematics 2024-07-18 Clément Berenfeld, Paul Rosa, Judith Rousseau