Related papers: SALSA: Sequential Approximate Leverage-Score Algor…
We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. We first develop a new fast algorithm to estimate the leverage scores of an autoregressive…
The statistical leverage scores of a matrix $A$ are the squared row-norms of the matrix containing its (top) left singular vectors and the coherence is the largest leverage score. These quantities are of interest in recently-popular…
We develop a new efficient algorithm for the analysis of large-scale time series data. We firstly define rolling averages, derive their analytical properties, and establish their asymptotic distribution. These theoretical results are…
A new likelihood based AR approximation is given for ARMA models. The usual algorithms for the computation of the likelihood of an ARMA model require $O(n)$ flops per function evaluation. Using our new approximation, an algorithm is…
We propose a multilevel stochastic approximation (MLSA) scheme for the computation of the value-at-risk (VaR) and expected shortfall (ES) of a financial loss, which can only be computed via simulations conditionally on the realisation of…
A matrix algorithm runs superfast (aka at sublinear cost) if it involves much fewer flops and memory cells than an input matrix has entries. Big Data are frequently represented by matrices of immense sizes that cannot be handled directly…
We present a new algorithm for finding a near optimal low-rank approximation of a matrix $A$ in $O(nnz(A))$ time. Our method is based on a recursive sampling scheme for computing a representative subset of $A$'s columns, which is then used…
Leverage score sampling provides an appealing way to perform approximate computations for large matrices. Indeed, it allows to derive faithful approximations with a complexity adapted to the problem at hand. Yet, performing leverage scores…
We propose a statistical adaptive procedure called SALSA for automatically scheduling the learning rate (step size) in stochastic gradient methods. SALSA first uses a smoothed stochastic line-search procedure to gradually increase the…
In time series analysis, when fitting an autoregressive model, one must solve a Toeplitz ordinary least squares problem numerous times to find an appropriate model, which can severely affect computational times with large data sets. Two…
Suppose a matrix $A \in \mathbb{R}^{m \times n}$ of rank $r$ with singular value decomposition $A = U_{A}\Sigma_{A} V_{A}^{T}$, where $U_{A} \in \mathbb{R}^{m \times r}$, $V_{A} \in \mathbb{R}^{n \times r}$ are orthonormal and $\Sigma_{A}…
This paper provides a non-asymptotic analysis of linear stochastic approximation (LSA) algorithms with fixed stepsize. This family of methods arises in many machine learning tasks and is used to obtain approximate solutions of a linear…
One popular method for dealing with large-scale data sets is sampling. For example, by using the empirical statistical leverage scores as an importance sampling distribution, the method of algorithmic leveraging samples and rescales…
There has been significant interest and progress recently in algorithms that solve regression problems involving tall and thin matrices in input sparsity time. These algorithms find shorter equivalent of a n*d matrix where n >> d, which…
Harnessing pre-trained LLMs to improve ASR systems, particularly for low-resource languages, is now an emerging area of research. Existing methods range from using LLMs for ASR error correction to tightly coupled systems that replace the…
We propose an efficient algorithm for the generalized sparse coding (SC) inference problem. The proposed framework applies to both the single dictionary setting, where each data point is represented as a sparse combination of the columns of…
SARSA is an on-policy algorithm to learn a Markov decision process policy in reinforcement learning. We investigate the SARSA algorithm with linear function approximation under the non-i.i.d.\ data, where a single sample trajectory is…
Iterative refinement is particularly popular for numerical solution of linear systems of equations. We extend it to Low Rank Approximation of a matrix (LRA) and observe close link of the resulting algorithm to oversampling techniques,…
In statistics and machine learning, logistic regression is a widely-used supervised learning technique primarily employed for binary classification tasks. When the number of observations greatly exceeds the number of predictor variables, we…
This paper presents a stochastic approximation proximal subgradient (SAPS) method for stochastic convex-concave minimax optimization. By accessing unbiased and variance bounded approximate subgradients, we show that this algorithm exhibits…