Related papers: A Concentration Bound for LSPE($\lambda$)

A Concentration Bound for Stochastic Approximation via Alekseev's Formula

Given an ODE and its perturbation, the Alekseev formula expresses the solutions of the latter in terms related to the former. By exploiting this formula and a new concentration inequality for martingale-differences, we develop a novel…

Optimization and Control · Mathematics 2019-04-02 Gugan Thoppe, Vivek S. Borkar

Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize

This paper provides a non-asymptotic analysis of linear stochastic approximation (LSA) algorithms with fixed stepsize. This family of methods arises in many machine learning tasks and is used to obtain approximate solutions of a linear…

Machine Learning · Statistics 2021-06-03 Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov, Kevin Scaman, Hoi-To Wai

Efficient Probabilistic Performance Bounds for Inverse Reinforcement Learning

In the field of reinforcement learning there has been recent progress towards safety and high-confidence bounds on policy performance. However, to our knowledge, no practical methods exist for determining high-confidence policy performance…

Artificial Intelligence · Computer Science 2018-06-26 Daniel S. Brown, Scott Niekum

Lambda-Policy Iteration with Randomization for Contractive Models with Infinite Policies: Well-Posedness and Convergence (Extended Version)

Abstract dynamic programming models are used to analyze $\lambda$-policy iteration with randomization algorithms. Particularly, contractive models with infinite policies are considered and it is shown that well-posedness of the…

Systems and Control · Electrical Eng. & Systems 2020-06-12 Yuchao Li, Karl H. Johansson, Jonas Mårtensson

Concentration Bounds for Stochastic Approximations

We obtain non-asymptotic concentration bounds for two kinds of stochastic approximations. We first consider the deviations between the expectation of a given function of the Euler scheme of some diffusion process at a fixed deterministic…

Probability · Mathematics 2012-12-12 Noufel Frikha, Stephane Menozzi

Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces

Policy evaluation with linear function approximation is an important problem in reinforcement learning. When facing high-dimensional feature spaces, such a problem becomes extremely hard considering the computation efficiency and quality of…

Machine Learning · Computer Science 2018-05-28 Haifang Li, Yingce Xia, Wensheng Zhang

Rate of Convergence and Error Bounds for LSTD($\lambda$)

We consider LSTD($\lambda$), the least-squares temporal-difference algorithm with eligibility traces proposed by Boyan (2002). It computes a linear approximation of the value function of a fixed policy in a large Markov Decision…

Machine Learning · Computer Science 2014-05-14 Manel Tagorti, Bruno Scherrer

Constraint-Tightening and Stability in Stochastic Model Predictive Control

Constraint tightening to non-conservatively guarantee recursive feasibility and stability in Stochastic Model Predictive Control is addressed. Stability and feasibility requirements are considered separately, highlighting the difference…

Systems and Control · Computer Science 2016-05-13 Matthias Lorenzen, Fabrizio Dabbene, Roberto Tempo, Frank Allgöwer

Tight Sample Complexity Bounds for Entropic Best Policy Identification

We study best-policy identification for finite-horizon risk-sensitive reinforcement learning under the entropic risk measure. Recent work established a constant gap in the exponential horizon dependence between lower and upper bounds on the…

Machine Learning · Computer Science 2026-05-14 Amer Essakine, Claire Vernade

Lambda-Policy Iteration: A Review and a New Implementation

In this paper we discuss $\lambda$-policy iteration, a method for exact and approximate dynamic programming. It is intermediate between the classical value iteration (VI) and policy iteration (PI) methods, and it is closely related to optimistic…

Systems and Control · Computer Science 2015-07-07 Dimitri P. Bertsekas

Concentration bounds for two time scale stochastic approximation

Viewing a two time scale stochastic approximation scheme as a noisy discretization of a singularly perturbed differential equation, we obtain a concentration bound for its iterates that captures its behavior with quantifiable high…

Optimization and Control · Mathematics 2018-06-29 Vivek S. Borkar, Sarath Pattathil

Concentration-Bound Analysis for Probabilistic Programs and Probabilistic Recurrence Relations

Analyzing probabilistic programs and randomized algorithms is a classical problem in computer science. The first basic problem in the analysis of stochastic processes is to consider the expectation or mean, and another basic problem is to…

Programming Languages · Computer Science 2020-08-13 Jinyi Wang, Yican Sun, Hongfei Fu, Mingzhang Huang, Amir Kafshdar Goharshady, Krishnendu Chatterjee

A Concentration Bound for Distributed Stochastic Approximation

We revisit the classical model of Tsitsiklis, Bertsekas and Athans for distributed stochastic approximation with consensus. The main result is an analysis of this scheme using the ODE approach to stochastic approximation, leading to a high…

Machine Learning · Statistics 2022-10-11 Harsh Dolhare, Vivek Borkar

Time-uniform concentration bounds for iterative algorithms

We develop a new framework for deriving time-uniform concentration bounds for the output of stochastic sequential algorithms satisfying certain recursive inequalities akin to those defining the almost-supermartingale processes introduced by…

Statistics Theory · Mathematics 2025-11-25 Tuan Pham, Alessandro Rinaldo, Purnamrita Sarkar

Fundamental Limitations in Sequential Prediction and Recursive Algorithms: $\mathcal{L}_{p}$ Bounds via an Entropic Analysis

In this paper, we obtain fundamental $\mathcal{L}_{p}$ bounds in sequential prediction and recursive algorithms via an entropic analysis. Both classes of problems are examined by investigating the underlying entropic relationships of the…

Machine Learning · Computer Science 2021-05-12 Song Fang, Quanyan Zhu

An Alternative Thresholding Rule for Compressed Sensing

Compressed Sensing algorithms often make use of the hard thresholding operator to pass from dense vectors to their best s-sparse approximations. However, the output of the hard thresholding operator does not depend on any information from a…

Numerical Analysis · Mathematics 2020-10-15 Jonathan Ashbrock

High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance

During recent years the interest of optimization and machine learning communities in high-probability convergence of stochastic optimization methods has been growing. One of the main reasons for this is that high-probability complexity…

Optimization and Control · Mathematics 2023-07-19 Abdurakhmon Sadiev, Marina Danilova, Eduard Gorbunov, Samuel Horváth, Gauthier Gidel, Pavel Dvurechensky, Alexander Gasnikov, Peter Richtárik

Intersection Types and (Positive) Almost-Sure Termination

Randomized higher-order computation can be seen as being captured by a lambda calculus endowed with a single algebraic operation, namely a construct for binary probabilistic choice. What matters about such computations is the probability of…

Logic in Computer Science · Computer Science 2020-12-24 Ugo Dal Lago, Claudia Faggian, Simona Ronchi Della Rocca

Concentration Bounds for Optimized Certainty Equivalent Risk Estimation

We consider the problem of estimating the Optimized Certainty Equivalent (OCE) risk from independent and identically distributed (i.i.d.) samples. For the classic sample average approximation (SAA) of OCE, we derive mean-squared error as…

Machine Learning · Computer Science 2024-06-03 Ayon Ghosh, L. A. Prashanth, Krishna Jagannathan

Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning

Two-timescale Stochastic Approximation (SA) algorithms are widely used in Reinforcement Learning (RL). Their iterates have two parts that are updated using distinct stepsizes. In this work, we develop a novel recipe for their finite sample…

Artificial Intelligence · Computer Science 2018-06-06 Gal Dalal, Balazs Szorenyi, Gugan Thoppe, Shie Mannor