Related papers: A Concentration Bound for LSPE($\lambda$)

200 papers

Given an ODE and its perturbation, the Alekseev formula expresses the solutions of the latter in terms related to the former. By exploiting this formula and a new concentration inequality for martingale-differences, we develop a novel…

Optimization and Control · Mathematics 2019-04-02 Gugan Thoppe, Vivek S. Borkar

This paper provides a non-asymptotic analysis of linear stochastic approximation (LSA) algorithms with fixed stepsize. This family of methods arises in many machine learning tasks and is used to obtain approximate solutions of a linear…

Machine Learning · Statistics 2021-06-03 Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov, Kevin Scaman, Hoi-To Wai

In the field of reinforcement learning there has been recent progress towards safety and high-confidence bounds on policy performance. However, to our knowledge, no practical methods exist for determining high-confidence policy performance…

Artificial Intelligence · Computer Science 2018-06-26 Daniel S. Brown, Scott Niekum

Abstract dynamic programming models are used to analyze $\lambda$-policy iteration with randomization algorithms. Particularly, contractive models with infinite policies are considered and it is shown that well-posedness of the…

Systems and Control · Electrical Eng. & Systems 2020-06-12 Yuchao Li, Karl H. Johansson, Jonas Mårtensson

We obtain non-asymptotic concentration bounds for two kinds of stochastic approximations. We first consider the deviations between the expectation of a given function of the Euler scheme of some diffusion process at a fixed deterministic…

Probability · Mathematics 2012-12-12 Noufel Frikha, Stephane Menozzi

Policy evaluation with linear function approximation is an important problem in reinforcement learning. When facing high-dimensional feature spaces, such a problem becomes extremely hard considering the computation efficiency and quality of…

Machine Learning · Computer Science 2018-05-28 Haifang Li, Yingce Xia, Wensheng Zhang

We consider LSTD($\lambda$), the least-squares temporal-difference algorithm with eligibility traces proposed by Boyan (2002). It computes a linear approximation of the value function of a fixed policy in a large Markov Decision…

Machine Learning · Computer Science 2014-05-14 Manel Tagorti, Bruno Scherrer

Constraint tightening to non-conservatively guarantee recursive feasibility and stability in Stochastic Model Predictive Control is addressed. Stability and feasibility requirements are considered separately, highlighting the difference…

Systems and Control · Computer Science 2016-05-13 Matthias Lorenzen, Fabrizio Dabbene, Roberto Tempo, Frank Allgöwer

We study best-policy identification for finite-horizon risk-sensitive reinforcement learning under the entropic risk measure. Recent work established a constant gap in the exponential horizon dependence between lower and upper bounds on the…

Machine Learning · Computer Science 2026-05-14 Amer Essakine, Claire Vernade

In this paper we discuss $\lambda$-policy iteration, a method for exact and approximate dynamic programming. It is intermediate between the classical value iteration (VI) and policy iteration (PI) methods, and it is closely related to optimistic…

Systems and Control · Computer Science 2015-07-07 Dimitri P. Bertsekas

Viewing a two time scale stochastic approximation scheme as a noisy discretization of a singularly perturbed differential equation, we obtain a concentration bound for its iterates that captures its behavior with quantifiable high…

Optimization and Control · Mathematics 2018-06-29 Vivek S. Borkar, Sarath Pattathil

Analyzing probabilistic programs and randomized algorithms is a classical problem in computer science. The first basic problem in the analysis of stochastic processes is to consider the expectation or mean, and another basic problem is to…

Programming Languages · Computer Science 2020-08-13 Jinyi Wang, Yican Sun, Hongfei Fu, Mingzhang Huang, Amir Kafshdar Goharshady, Krishnendu Chatterjee

We revisit the classical model of Tsitsiklis, Bertsekas and Athans for distributed stochastic approximation with consensus. The main result is an analysis of this scheme using the ODE approach to stochastic approximation, leading to a high…

Machine Learning · Statistics 2022-10-11 Harsh Dolhare, Vivek Borkar

We develop a new framework for deriving time-uniform concentration bounds for the output of stochastic sequential algorithms satisfying certain recursive inequalities akin to those defining the almost-supermartingale processes introduced by…

Statistics Theory · Mathematics 2025-11-25 Tuan Pham, Alessandro Rinaldo, Purnamrita Sarkar

In this paper, we obtain fundamental $\mathcal{L}_{p}$ bounds in sequential prediction and recursive algorithms via an entropic analysis. Both classes of problems are examined by investigating the underlying entropic relationships of the…

Machine Learning · Computer Science 2021-05-12 Song Fang, Quanyan Zhu

Compressed Sensing algorithms often make use of the hard thresholding operator to pass from dense vectors to their best s-sparse approximations. However, the output of the hard thresholding operator does not depend on any information from a…

Numerical Analysis · Mathematics 2020-10-15 Jonathan Ashbrock

During recent years the interest of optimization and machine learning communities in high-probability convergence of stochastic optimization methods has been growing. One of the main reasons for this is that high-probability complexity…

Randomized higher-order computation can be seen as being captured by a lambda calculus endowed with a single algebraic operation, namely a construct for binary probabilistic choice. What matters about such computations is the probability of…

Logic in Computer Science · Computer Science 2020-12-24 Ugo Dal Lago, Claudia Faggian, Simona Ronchi Della Rocca

We consider the problem of estimating the Optimized Certainty Equivalent (OCE) risk from independent and identically distributed (i.i.d.) samples. For the classic sample average approximation (SAA) of OCE, we derive mean-squared error as…

Machine Learning · Computer Science 2024-06-03 Ayon Ghosh, L. A. Prashanth, Krishna Jagannathan

Two-timescale Stochastic Approximation (SA) algorithms are widely used in Reinforcement Learning (RL). Their iterates have two parts that are updated using distinct stepsizes. In this work, we develop a novel recipe for their finite sample…

Artificial Intelligence · Computer Science 2018-06-06 Gal Dalal, Balazs Szorenyi, Gugan Thoppe, Shie Mannor