Related papers: Provably Efficient Algorithms for Multi-Objective …

An Online Multiobjective Policy Gradient for Long-run Average-reward Markov Decision Process

We propose a reinforcement learning (RL) framework for multi-objective decision-making, where the agent seeks to optimize a vector of rewards rather than a single scalar value. The objective is to ensure that the time-averaged reward vector…

Systems and Control · Electrical Eng. & Systems 2025-11-18 Rahul Misra, Manuela L. Bujorianu, Rafał Wisniewski

Blackwell's Approachability with Approximation Algorithms

We revisit Blackwell's celebrated approachability problem which considers a repeated vector-valued game between a player and an adversary. Motivated by settings in which the action set of the player or adversary (or both) is difficult to…

Optimization and Control · Mathematics 2025-06-17 Dan Garber, Mhna Massalha

Replicable Reinforcement Learning with Linear Function Approximation

Replication of experimental results has been a challenge faced by many scientific disciplines, including the field of machine learning. Recent work on the theory of machine learning has formalized replicability as the demand that an…

Machine Learning · Computer Science 2026-04-15 Eric Eaton, Marcel Hussing, Michael Kearns, Aaron Roth, Sikata Bela Sengupta, Jessica Sorrell

Blackwell Approachability and Low-Regret Learning are Equivalent

We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. We show that Blackwell's result is equivalent, via efficient reductions, to the existence of "no-regret" algorithms for Online Linear…

Machine Learning · Computer Science 2010-11-10 Jacob Abernethy, Peter L. Bartlett, Elad Hazan

Response-Based Approachability and its Application to Generalized No-Regret Algorithms

Approachability theory, introduced by Blackwell (1956), provides fundamental results on repeated games with vector-valued payoffs, and has been usefully applied since in the theory of learning in games and to learning algorithms in the…

Machine Learning · Computer Science 2013-12-31 Andrey Bernstein, Nahum Shimkin

Refined approachability algorithms and application to regret minimization with global costs

Blackwell's approachability is a framework where two players, the Decision Maker and the Environment, play a repeated game with vector-valued payoffs. The goal of the Decision Maker is to make the average payoff converge to a given set…

Machine Learning · Computer Science 2021-09-08 Joon Kwon

Provable Self-Play Algorithms for Competitive Reinforcement Learning

Self-play, where the algorithm learns by playing against itself without requiring any direct supervision, has become the new weapon in modern Reinforcement Learning (RL) for achieving superhuman performance in practice. However, the…

Machine Learning · Computer Science 2020-07-10 Yu Bai, Chi Jin

Approachability in Stackelberg Stochastic Games with Vector Costs

The notion of approachability was introduced by Blackwell [1] in the context of vector-valued repeated games. The famous Blackwell's approachability theorem prescribes a strategy for approachability, i.e., for 'steering' the average cost of…

Machine Learning · Computer Science 2016-06-22 Dileep Kalathil, Vivek Borkar, Rahul Jain

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms

Recent years have witnessed significant advances in reinforcement learning (RL), which has registered great success in solving various sequential decision-making problems in machine learning. Most of the successful RL applications, e.g.,…

Machine Learning · Computer Science 2021-04-30 Kaiqing Zhang, Zhuoran Yang, Tamer Başar

Faster Game Solving via Predictive Blackwell Approachability: Connecting Regret Matching and Mirror Descent

Blackwell approachability is a framework for reasoning about repeated games with vector-valued payoffs. We introduce predictive Blackwell approachability, where an estimate of the next payoff vector is given, and the decision maker tries to…

Computer Science and Game Theory · Computer Science 2021-03-09 Gabriele Farina, Christian Kroer, Tuomas Sandholm

Beyond Optimism: Exploration With Partially Observable Rewards

Exploration in reinforcement learning (RL) remains an open challenge. RL algorithms rely on observing rewards to train the agent, and if informative rewards are sparse the agent learns slowly or may not learn at all. To improve exploration…

Machine Learning · Computer Science 2024-11-12 Simone Parisi, Alireza Kazemipour, Michael Bowling

Provably Invincible Adversarial Attacks on Reinforcement Learning Systems: A Rate-Distortion Information-Theoretic Approach

Reinforcement learning (RL) for the Markov Decision Process (MDP) has emerged in many security-related applications, such as autonomous driving, financial decisions, and drone/robot algorithms. In order to improve the robustness/defense of…

Machine Learning · Computer Science 2025-10-16 Ziqing Lu, Lifeng Lai, Weiyu Xu

Provably Efficient Reinforcement Learning with Linear Function Approximation

Modern Reinforcement Learning (RL) is commonly applied to practical problems with an enormous number of states, where function approximation must be deployed to approximate either the value function or the policy. The introduction of…

Machine Learning · Computer Science 2019-08-09 Chi Jin, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan

Provably Efficient Reinforcement Learning via Surprise Bound

Value function approximation is important in modern reinforcement learning (RL) problems especially when the state space is (infinitely) large. Despite the importance and wide applicability of value function approximation, its theoretical…

Machine Learning · Computer Science 2023-02-24 Hanlin Zhu, Ruosong Wang, Jason D. Lee

Provably Efficient Maximum Entropy Exploration

Suppose an agent is in a (possibly unknown) Markov Decision Process in the absence of a reward signal, what might we hope that an agent can efficiently learn to do? This work studies a broad class of objectives that are defined solely as…

Machine Learning · Computer Science 2019-01-29 Elad Hazan, Sham M. Kakade, Karan Singh, Abby Van Soest

Provably Efficient Multi-Task Reinforcement Learning with Model Transfer

We study multi-task reinforcement learning (RL) in tabular episodic Markov decision processes (MDPs). We formulate a heterogeneous multi-player RL problem, in which a group of players concurrently face similar but not necessarily identical…

Machine Learning · Computer Science 2022-01-19 Chicheng Zhang, Zhi Wang

Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning

Reinforcement learning (RL) -- algorithms that teach artificial agents to interact with environments by maximising reward signals -- has achieved significant success in recent years. These successes have been facilitated by advances in…

Machine Learning · Computer Science 2025-04-03 Llewyn Salt, Marcus Gallagher

Learning to Reach Goals via Iterated Supervised Learning

Current reinforcement learning (RL) algorithms can be brittle and difficult to use, especially when learning goal-reaching behaviors from sparse rewards. Although supervised imitation learning provides a simple and stable alternative, it…

Machine Learning · Computer Science 2020-10-06 Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Devin, Benjamin Eysenbach, Sergey Levine

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning

Policy optimization methods with function approximation are widely used in multi-agent reinforcement learning. However, it remains elusive how to design such algorithms with statistical guarantees. Leveraging a multi-agent performance…

Machine Learning · Computer Science 2023-05-09 Yulai Zhao, Zhuoran Yang, Zhaoran Wang, Jason D. Lee

Average Reward Adjusted Discounted Reinforcement Learning: Near-Blackwell-Optimal Policies for Real-World Applications

Although in recent years reinforcement learning has become very popular, the number of successful applications to different kinds of operations research problems is rather scarce. Reinforcement learning is based on the well-studied dynamic…

Machine Learning · Computer Science 2020-04-03 Manuel Schneckenreither