Statistical Theory
Selective prediction, where a model has the option to abstain from making a decision, is crucial for machine learning applications in which mistakes are costly. In this work, we focus on distributional regression and introduce a framework…
Estimating the mode of a unimodal distribution is a classical problem in statistics. Although the literature offers several approaches to point estimation of the mode, interval estimation of the mode has received very little attention.…
We study the bias and the mean-squared error of the maximum likelihood estimators (MLE) of parameters associated with a two-parameter mean-reverting process for a finite time $T$. Using the likelihood ratio process, we derive the…
Under certain conditions, the largest eigenvalue of a sample covariance matrix undergoes a well-known phase transition when the sample size $n$ and data dimension $p$ diverge proportionally. In the subcritical regime, this eigenvalue has…
In this paper, we consider the reproducing property in Reproducing Kernel Hilbert Spaces (RKHS). We establish a reproducing property for the closure of the class of combinations of composition operators under minimal conditions. This allows…
A recent line of work provides new statistical tools based on game theory and achieves safe anytime-valid inference without assuming regularity conditions. In particular, the framework of universal inference proposed by Wasserman, Ramdas…
We introduce a new predictive mechanism that operates in the presence of hidden confounding across distributionally diverse data sources while ensuring consistent estimation of causal parameters, despite their recognized suboptimality for…
The Peaks Over Threshold (POT) method is the most popular statistical method for the analysis of univariate extremes. Even though there is a rich applied literature on Bayesian inference for the POT, the asymptotic theory for such proposals…
Frailty models are essential tools in survival analysis for addressing unobserved heterogeneity and random effects in the data. These models incorporate a random effect, the frailty, which is assumed to impact the hazard rate…
The extremal dependence structure of a regularly varying $d$-dimensional random vector can be described by its angular measure. The standard nonparametric estimator of this measure is the empirical measure of the observed angles of the $k$…
We study the problem of parametric estimation for a continuously observed stochastic differential equation driven by fractional Brownian motion. Under some assumptions on the drift and diffusion coefficients, we construct maximum likelihood…
The distance function to a compact set plays a crucial role in the paradigm of topological data analysis. In particular, the sublevel sets of the distance function are used in the computation of persistent homology -- a backbone of the…
In this paper, we analyze the relative errors in various reliability measures due to the tacit assumption that the components associated with an $n$-component series system or a parallel system work independently where the components…
In this paper, we analyze the relative errors that arise in various reliability measures due to the tacit assumption that the components of an $n$-component series system or a parallel system work independently…
Variable selection is an important step in many modern statistical inference procedures. In the regression setting, when estimators cannot shrink irrelevant signals to zero, covariates without relationships to the response often…
Causal models in statistics are often described using acyclic directed mixed graphs (ADMGs), which contain directed and bidirected edges and no directed cycles. This article surveys various interpretations of ADMGs, discusses their…
We consider experimentation in the presence of non-stationarity, inter-unit (spatial) interference, and carry-over effects (temporal interference), where we wish to estimate the global average treatment effect (GATE), the difference between…
In Learning Theory, the smoothness assumption on the target function (known as source condition) is a key factor in establishing theoretical convergence rates for an estimator. The existing general form of the source condition, as discussed…
We study the nonparametric maximum likelihood estimator $\widehat{\pi}$ for Gaussian location mixtures in one dimension. It has been known since Lindsay (1983) that, given an $n$-point dataset, this estimator always returns a mixture with…
Statistical learning methods typically assume that the training and test data originate from the same distribution, enabling effective risk minimization. However, real-world applications frequently involve distributional shifts, leading to…