
Related papers: Batch Value-function Approximation with Only Reali…

200 papers

Modern Reinforcement Learning (RL) is commonly applied to practical problems with an enormous number of states, where function approximation must be deployed to approximate either the value function or the policy. The introduction of…

Machine Learning · Computer Science 2019-08-09 Chi Jin, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan

The classical theory of reinforcement learning (RL) has focused on tabular and linear representations of value functions. Further progress hinges on combining RL with modern function approximators such as kernel functions and deep neural…

Machine Learning · Computer Science 2021-01-01 Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael I. Jordan

Value function approximation has demonstrated phenomenal empirical success in reinforcement learning (RL). Nevertheless, despite a handful of recent progress on developing theory for RL with linear function approximation, the understanding…

Machine Learning · Computer Science 2020-06-22 Ruosong Wang, Ruslan Salakhutdinov, Lin F. Yang

Value-function approximation methods that operate in batch mode have foundational importance to reinforcement learning (RL). Finite sample guarantees for these methods often crucially rely on two types of assumptions: (1) mild distribution…

Machine Learning · Computer Science 2019-05-02 Jinglin Chen, Nan Jiang

Bayesian Reinforcement Learning (RL) is capable of not only incorporating domain knowledge, but also solving the exploration-exploitation dilemma in a natural way. As Bayesian RL is intractable except for special cases, previous work has…

Artificial Intelligence · Computer Science 2013-06-14 Kenji Kawaguchi, Mauricio Araya

Reinforcement Learning with Verifiable Rewards (RLVR) has achieved great success in developing Large Language Models (LLMs) with chain-of-thought rollouts for many tasks such as math and coding. Nevertheless, RLVR struggles with sample…

Machine Learning · Computer Science 2026-05-15 Kai Yan, Alexander G. Schwing, Yu-Xiong Wang

Reinforcement Learning from Verifiable Rewards (RLVR) suffers from exploration inefficiency, where models struggle to generate successful rollouts, resulting in minimal learning signal. This challenge is particularly severe for tasks that…

Machine Learning · Computer Science 2026-03-20 Saaket Agashe, Jayanth Srinivasa, Gaowen Liu, Ramana Kompella, Xin Eric Wang

Batch reinforcement learning (RL) is important to apply RL algorithms to many high stakes tasks. Doing batch RL in a way that yields a reliable new policy in large domains is challenging: a new decision policy may visit states and actions…

Machine Learning · Computer Science 2020-07-23 Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill

Balancing exploration and exploitation remains a key challenge in reinforcement learning (RL). State-of-the-art RL algorithms suffer from high sample complexity, particularly in the sparse reward case, where they can do no better than to…

Machine Learning · Computer Science 2020-01-22 Philippe Morere, Gilad Francis, Tom Blau, Fabio Ramos

Value function approximation is important in modern reinforcement learning (RL) problems especially when the state space is (infinitely) large. Despite the importance and wide applicability of value function approximation, its theoretical…

Machine Learning · Computer Science 2023-02-24 Hanlin Zhu, Ruosong Wang, Jason D. Lee

Mathematical reasoning is a central challenge for large language models (LLMs), requiring not only correct answers but also faithful reasoning processes. Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a promising…

Machine Learning · Computer Science 2025-12-02 Md Tanvirul Alam, Nidhi Rastogi

We prove performance guarantees of two algorithms for approximating $Q^\star$ in batch reinforcement learning. Compared to classical iterative methods such as Fitted Q-Iteration---whose performance loss incurs quadratic dependence on…

Machine Learning · Computer Science 2020-08-25 Tengyang Xie, Nan Jiang

Accurate chart comprehension represents a critical challenge in advancing multimodal learning systems, as extensive information is compressed into structured visual representations. However, existing vision-language models (VLMs) frequently…

Machine Learning · Computer Science 2026-03-10 Xin Zhang, Xingyu Li, Rongguang Wang, Ruizhong Miao, Zheng Wang, Dan Roth, Chenyang Li

Reliant on too many experiments to learn good actions, current Reinforcement Learning (RL) algorithms have limited applicability in real-world settings, which can be too expensive to allow exploration. We propose an algorithm for batch RL,…

Machine Learning · Computer Science 2021-12-07 Rasool Fakoor, Jonas Mueller, Kavosh Asadi, Pratik Chaudhari, Alexander J. Smola

Kernel-based reinforcement learning (KBRL) stands out among reinforcement learning algorithms for its strong theoretical guarantees. By casting the learning problem as a local kernel approximation, KBRL provides a way of computing a…

Machine Learning · Computer Science 2014-07-22 André M. S. Barreto, Doina Precup, Joelle Pineau

Low-complexity models such as linear function representation play a pivotal role in enabling sample-efficient reinforcement learning (RL). The current paper pertains to a scenario with value-based linear representation, which postulates the…

Machine Learning · Computer Science 2021-10-19 Gen Li, Yuxin Chen, Yuejie Chi, Yuantao Gu, Yuting Wei

We consider the offline reinforcement learning problem, where the aim is to learn a decision making policy from logged data. Offline RL -- particularly when coupled with (value) function approximation to allow for generalization in large or…

Machine Learning · Computer Science 2022-08-31 Dylan J. Foster, Akshay Krishnamurthy, David Simchi-Levi, Yunzong Xu

This paper investigates the computational complexity of reinforcement learning in a novel linear function approximation regime, termed partial $q^{\pi}$-realizability. In this framework, the objective is to learn an $\epsilon$-optimal…

Artificial Intelligence · Computer Science 2025-10-31 Shayan Karimi, Xiaoqi Tan

Reward-free reinforcement learning (RL) is a framework which is suitable for both the batch RL setting and the setting where there are many reward functions of interest. During the exploration phase, an agent collects samples without using…

Machine Learning · Computer Science 2020-06-22 Ruosong Wang, Simon S. Du, Lin F. Yang, Ruslan Salakhutdinov

We study multi-objective reinforcement learning (RL) where an agent's reward is represented as a vector. In settings where an agent competes against opponents, its performance is measured by the distance of its average return vector to a…

Machine Learning · Computer Science 2021-02-08 Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra