Related papers: Efficient Local Planning with Linear Function Appr…

Confident Approximate Policy Iteration for Efficient Local Planning in $q^\pi$-realizable MDPs

We consider approximate dynamic programming in $\gamma$-discounted Markov decision processes and apply it to approximate planning with linear value-function approximation. Our first contribution is a new variant of Approximate Policy…

Machine Learning · Computer Science 2022-10-31 Gellért Weisz, András György, Tadashi Kozuno, Csaba Szepesvári

On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function

We consider local planning in fixed-horizon MDPs with a generative model under the assumption that the optimal value function lies close to the span of a feature map. The generative model provides a local access to the MDP: The planner can…

Machine Learning · Computer Science 2021-07-12 Gellért Weisz, Philip Amortila, Barnabás Janzer, Yasin Abbasi-Yadkori, Nan Jiang, Csaba Szepesvári

Reinforcement Learning with Unbiased Policy Evaluation and Linear Function Approximation

We provide performance guarantees for a variant of simulation-based policy iteration for controlling Markov decision processes that involves the use of stochastic approximation algorithms along with state-of-the-art techniques that are…

Machine Learning · Computer Science 2022-10-17 Anna Winnicki, R. Srikant

Provably Efficient Reinforcement Learning with Linear Function Approximation

Modern Reinforcement Learning (RL) is commonly applied to practical problems with an enormous number of states, where function approximation must be deployed to approximate either the value function or the policy. The introduction of…

Machine Learning · Computer Science 2019-08-09 Chi Jin, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan

Frozen Policy Iteration: Computationally Efficient RL under Linear $Q^{\pi}$ Realizability for Deterministic Dynamics

We study computationally and statistically efficient reinforcement learning under the linear $Q^{\pi}$ realizability assumption, where any policy's $Q$-function is linear in a given state-action feature representation. Prior methods in this…

Machine Learning · Computer Science 2026-03-03 Yijing Ke, Zihan Zhang, Ruosong Wang

Efficient Planning in Large MDPs with Weak Linear Function Approximation

Large-scale Markov decision processes (MDPs) require planning algorithms with runtime independent of the number of states of the MDP. We consider the planning problem in MDPs using linear value function approximation with only weak…

Machine Learning · Computer Science 2020-07-14 Roshan Shariff, Csaba Szepesvári

Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning

A practical challenge in reinforcement learning is posed by combinatorial action spaces that make planning computationally demanding. For example, in cooperative multi-agent reinforcement learning, a potentially large number of agents jointly…

Machine Learning · Computer Science 2023-02-10 Volodymyr Tkachuk, Seyed Alireza Bakhtiari, Johannes Kirschner, Matej Jusup, Ilija Bogunovic, Csaba Szepesvári

Linear Bellman Completeness Suffices for Efficient Online Reinforcement Learning with Few Actions

One of the most natural approaches to reinforcement learning (RL) with function approximation is value iteration, which inductively generates approximations to the optimal value function by solving a sequence of regression problems. To…

Machine Learning · Computer Science 2024-06-19 Noah Golowich, Ankur Moitra

Provably Efficient $Q$-learning with Function Approximation via Distribution Shift Error Checking Oracle

$Q$-learning with function approximation is one of the most popular methods in reinforcement learning. Though the idea of using function approximation was proposed at least 60 years ago, even in the simplest setup, i.e., approximating…

Machine Learning · Computer Science 2019-11-05 Simon S. Du, Yuping Luo, Ruosong Wang, Hanrui Zhang

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning

Policy optimization methods with function approximation are widely used in multi-agent reinforcement learning. However, it remains elusive how to design such algorithms with statistical guarantees. Leveraging a multi-agent performance…

Machine Learning · Computer Science 2023-05-09 Yulai Zhao, Zhuoran Yang, Zhaoran Wang, Jason D. Lee

Decentralized MCTS via Learned Teammate Models

Decentralized online planning can be an attractive paradigm for cooperative multi-agent systems, due to improved scalability and robustness. A key difficulty of such an approach lies in making accurate predictions about the decisions of other…

Artificial Intelligence · Computer Science 2020-11-11 Aleksander Czechowski, Frans A. Oliehoek

Safe Policy Optimization with Local Generalized Linear Function Approximations

Safe exploration is a key to applying reinforcement learning (RL) in safety-critical systems. Existing safe exploration methods guaranteed safety under the assumption of regularity, and it has been difficult to apply them to large-scale…

Machine Learning · Computer Science 2021-11-10 Akifumi Wachi, Yunyue Wei, Yanan Sui

Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee

Local Policy Search is a popular reinforcement learning approach for handling large state spaces. Formally, it searches locally in a parameterized policy space in order to maximize the associated value function averaged over some…

Machine Learning · Computer Science 2013-06-07 Bruno Scherrer, Matthieu Geist

Near-optimal Policy Identification in Active Reinforcement Learning

Many real-world reinforcement learning tasks require control of complex dynamical systems that involve both costly data acquisition processes and large state spaces. In cases where the transition dynamics can be readily evaluated at…

Machine Learning · Statistics 2022-12-20 Xiang Li, Viraj Mehta, Johannes Kirschner, Ian Char, Willie Neiswanger, Jeff Schneider, Andreas Krause, Ilija Bogunovic

On the Convergence of Reinforcement Learning with Monte Carlo Exploring Starts

A basic simulation-based reinforcement learning algorithm is the Monte Carlo Exploring Starts (MCES) method, also known as optimistic policy iteration, in which the value function is approximated by simulated returns and a greedy policy is…

Optimization and Control · Mathematics 2020-07-22 Jun Liu

On The Convergence Of Policy Iteration-Based Reinforcement Learning With Monte Carlo Policy Evaluation

A common technique in reinforcement learning is to evaluate the value function from Monte Carlo simulations of a given policy, and use the estimated value function to obtain a new policy which is greedy with respect to the estimated value…

Machine Learning · Computer Science 2023-03-01 Anna Winnicki, R. Srikant

On the Near-Optimality of Local Policies in Large Cooperative Multi-Agent Reinforcement Learning

We show that in a cooperative $N$-agent network, one can design locally executable policies for the agents such that the resulting discounted sum of average rewards (value) well approximates the optimal value computed over all (including…

Machine Learning · Computer Science 2022-09-09 Washim Uddin Mondal, Vaneet Aggarwal, Satish V. Ukkusuri

Flexible and Approximate Computation through State-Space Reduction

In the real world, insufficient information, limited computation resources, and complex problem structures often force an autonomous agent to make a decision in time less than that required to solve the problem at hand completely. Flexible…

Artificial Intelligence · Computer Science 2013-02-01 Weixiong Zhang

Computational Hardness of Reinforcement Learning with Partial $q^{\pi}$-Realizability

This paper investigates the computational complexity of reinforcement learning in a novel linear function approximation regime, termed partial $q^{\pi}$-realizability. In this framework, the objective is to learn an $\epsilon$-optimal…

Artificial Intelligence · Computer Science 2025-10-31 Shayan Karimi, Xiaoqi Tan

Online Planning in POMDPs with Self-Improving Simulators

How can we plan efficiently in a large and complex environment when the time budget is limited? Given the original simulator of the environment, which may be computationally very demanding, we propose to learn online an approximate but much…

Artificial Intelligence · Computer Science 2022-12-14 Jinke He, Miguel Suau, Hendrik Baier, Michael Kaisers, Frans A. Oliehoek