Related papers: Pseudo-maximization and self-normalized processes
Self-normalized processes arise naturally in statistical applications. Being unit free, they are not affected by scale changes. Moreover, self-normalization often eliminates or weakens moment assumptions. In this paper we present several…
Score-based tests have been used to study parameter heterogeneity across many types of statistical models. This chapter describes a new self-normalization approach for score-based tests of mixed models, which addresses situations where…
Self-normalized processes arise naturally in many learning-related tasks. While self-normalized concentration has been extensively studied for scalar-valued processes, there are few results for multidimensional processes outside of the…
This paper proposes self-normalized tests for multistep conditional predictive ability in forecast comparison. By normalizing the sample mean of the transformed loss differential using functionals of its cumulative sum (CUSUM) process,…
This work unifies the analysis of various randomized methods for solving linear and nonlinear inverse problems by framing the problem in a stochastic optimization setting. By doing so, we show that many randomized methods are variants of a…
A new method, called the method of self-similar approximants, and its recent developments are described. The method is based on the ideas of renormalization group theory and optimal control theory. It allows for the effective extrapolation…
We derive theorems which outline explicit mechanisms by which anomalous scaling for the probability density function of the sum of many correlated random variables asymptotically prevails. The results characterize general anomalous scaling…
Pseudo-variograms appear naturally in the context of multivariate Brown-Resnick processes, and are a useful tool for analysis and prediction of multivariate random fields. We give a necessary and sufficient criterion for a matrix-valued…
Much of statistics relies upon four key elements: a law of large numbers, a calculus to operationalize stochastic convergence, a central limit theorem, and a framework for constructing local approximations. These elements are…
To mitigate the problem of having to traverse over the full vocabulary in the softmax normalization of a neural language model, sampling-based training criteria are proposed and investigated in the context of large vocabulary word-based…
Testing hypotheses is an issue of primary importance in scientific research, as well as in many other human activities. Much clarification about it can be achieved if the process of learning from data is framed in a stochastic model of…
Assuming a $q$-variant of the prime $k$-tuple conjecture uniformly, we compute mixed moments of the number of primes in disjoint short intervals and progressions, respectively. This involves estimating the mean of singular series along…
Calculation of the log-normalizer is a major computational obstacle in applications of log-linear models with large output spaces. The problem of fast normalizer computation has therefore attracted significant attention in the theoretical…
We propose a new method to construct confidence intervals for quantities that are associated with a stationary time series, which avoids direct estimation of the asymptotic variances. Unlike the existing tuning-parameter-dependent…
A new bivariate partial sum process for locally stationary time series is introduced and its weak convergence to a Brownian sheet is established. This construction enables the development of a novel self-normalized CUSUM test statistic for…
Stochastic optimisation algorithms are the de facto standard for machine learning with large amounts of data. Handling only a subset of available data in each optimisation step dramatically reduces the per-iteration computational costs,…
In this paper, we study pseudo-labelling. Pseudo-labelling employs raw inferences on unlabelled data as pseudo-labels for self-training. We elucidate the empirical successes of pseudo-labelling by establishing a link between this technique…
Majorization-minimization algorithms consist of iteratively minimizing a majorizing surrogate of an objective function. Because of its simplicity and its wide applicability, this principle has been very popular in statistics and in signal…
We study statistical inferences for a class of modulated stationary processes with time-dependent variances. Due to non-stationarity and the large number of unknown parameters, existing methods for stationary, or locally stationary, time…
In this article, we consider the problem of simultaneous testing of hypotheses when the individual test statistics are not necessarily independent. Specifically, we consider the problem of simultaneous testing of point null hypotheses…