Related papers: Learning Efficiently Function Approximation for Co…

Sample Complexity Characterization for Linear Contextual MDPs

Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable. While CMDPs serve…

Machine Learning · Computer Science 2024-02-06 Junze Deng, Yuan Cheng, Shaofeng Zou, Yingbin Liang

Contextual Decision Processes with Low Bellman Rank are PAC-Learnable

This paper studies systematic exploration for reinforcement learning with rich observations and function approximation. We introduce a new model called contextual decision processes, that unifies and generalizes most prior settings. Our…

Machine Learning · Computer Science 2016-12-02 Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire

Generalization in Monitored Markov Decision Processes (Mon-MDPs)

Reinforcement learning (RL) typically models the interaction between the agent and environment as a Markov decision process (MDP), where the rewards that guide the agent's behavior are always observable. However, in many real-world…

Artificial Intelligence · Computer Science 2025-05-15 Montaser Mohammedalamen, Michael Bowling

Provably Efficient Cooperative Multi-Agent Reinforcement Learning with Function Approximation

Reinforcement learning in cooperative multi-agent settings has recently advanced significantly in its scope, with applications in cooperative estimation for advertising, dynamic treatment regimes, distributed control, and federated…

Machine Learning · Computer Science 2021-03-30 Abhimanyu Dubey, Alex Pentland

Learning Efficient Representations for Reinforcement Learning

Markov decision processes (MDPs) are a well studied framework for solving sequential decision making problems under uncertainty. Exact methods for solving MDPs based on dynamic programming such as policy iteration and value iteration are…

Artificial Intelligence · Computer Science 2015-09-09 Yanping Huang

A Model-based Approach for Sample-efficient Multi-task Reinforcement Learning

The aim of multi-task reinforcement learning is two-fold: (1) efficiently learn by training against multiple tasks and (2) quickly adapt, using limited samples, to a variety of new tasks. In this work, the tasks correspond to reward…

Machine Learning · Computer Science 2019-11-05 Nicholas C. Landolfi, Garrett Thomas, Tengyu Ma

Block Contextual MDPs for Continual Learning

In reinforcement learning (RL), when defining a Markov Decision Process (MDP), the environment dynamics is implicitly assumed to be stationary. This assumption of stationarity, while simplifying, can be unrealistic in many scenarios. In the…

Machine Learning · Computer Science 2021-10-15 Shagun Sodhani, Franziska Meier, Joelle Pineau, Amy Zhang

Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration

To generalize across tasks, an agent should acquire knowledge from past tasks that facilitates adaptation and exploration in future tasks. We focus on the problem of in-context adaptation and exploration, where an agent only relies on…

Machine Learning · Computer Science 2023-05-05 Chentian Jiang, Nan Rosemary Ke, Hado van Hasselt

Reinforcement Learning with History-Dependent Dynamic Contexts

We introduce Dynamic Contextual Markov Decision Processes (DCMDPs), a novel reinforcement learning framework for history-dependent environments that generalizes the contextual MDP framework to handle non-Markov environments, where contexts…

Machine Learning · Computer Science 2023-05-19 Guy Tennenholtz, Nadav Merlis, Lior Shani, Martin Mladenov, Craig Boutilier

Temporal-Difference estimation of dynamic discrete choice models

We study the use of Temporal-Difference learning for estimating the structural parameters in dynamic discrete choice models. Our algorithms are based on the conditional choice probability approach but use functional approximations to…

Econometrics · Economics 2022-12-23 Karun Adusumilli, Dita Eckardt

Reinforcement Learning for Learning of Dynamical Systems in Uncertain Environment: a Tutorial

In this paper, a review of model-free reinforcement learning for learning of dynamical systems in uncertain environments is discussed. For this purpose, the Markov Decision Process (MDP) will be reviewed. Furthermore, some learning…

Machine Learning · Computer Science 2019-05-21 Mehran Attar, Mohammadreza Dabirian

Learning Reward for Physical Skills using Large Language Model

Learning reward functions for physical skills is challenging due to the vast spectrum of skills, the high-dimensionality of state and action space, and nuanced sensory feedback. The complexity of these tasks makes acquiring expert…

Robotics · Computer Science 2023-10-24 Yuwei Zeng, Yiqing Xu

Exploiting Multiple Abstractions in Episodic RL via Reward Shaping

One major limitation to the applicability of Reinforcement Learning (RL) to many practical domains is the large number of samples required to learn an optimal policy. To address this problem and improve learning efficiency, we consider a…

Machine Learning · Computer Science 2023-08-07 Roberto Cipollone, Giuseppe De Giacomo, Marco Favorito, Luca Iocchi, Fabio Patrizi

Multi-User Reinforcement Learning with Low Rank Rewards

In this work, we consider the problem of collaborative multi-user reinforcement learning. In this setting there are multiple users with the same state-action space and transition probabilities but with different rewards. Under the…

Machine Learning · Computer Science 2023-05-23 Naman Agarwal, Prateek Jain, Suhas Kowshik, Dheeraj Nagaraj, Praneeth Netrapalli

Towards Robust Bisimulation Metric Learning

Learned representations in deep reinforcement learning (DRL) have to extract task-relevant information from complex observations, balancing between robustness to distraction and informativeness to the policy. Such stable and rich…

Machine Learning · Computer Science 2021-10-28 Mete Kemertas, Tristan Aumentado-Armstrong

On the Possibility of Learning in Reactive Environments with Arbitrary Dependence

We address the problem of reinforcement learning in which observations may exhibit an arbitrary form of stochastic dependence on past observations and actions, i.e. environments more general than (PO)MDPs. The task for an agent is to attain…

Machine Learning · Computer Science 2009-12-30 Daniil Ryabko, Marcus Hutter

Self-Paced Contextual Reinforcement Learning

Generalization and adaptation of learned skills to novel situations is a core requirement for intelligent autonomous robots. Although contextual reinforcement learning provides a principled framework for learning and generalization of…

Machine Learning · Computer Science 2019-10-08 Pascal Klink, Hany Abdulsamad, Boris Belousov, Jan Peters

Towards Effective Context for Meta-Reinforcement Learning: an Approach based on Contrastive Learning

Context, the embedding of previously collected trajectories, is a powerful construct for Meta-Reinforcement Learning (Meta-RL) algorithms. By conditioning on an effective context, Meta-RL policies can easily generalize to new tasks within a…

Machine Learning · Computer Science 2020-12-16 Haotian Fu, Hongyao Tang, Jianye Hao, Chen Chen, Xidong Feng, Dong Li, Wulong Liu

Dynamic Teaching in Sequential Decision Making Environments

We describe theoretical bounds and a practical algorithm for teaching a model by demonstration in a sequential decision making environment. Unlike previous efforts that have optimized learners that watch a teacher demonstrate a static…

Machine Learning · Computer Science 2012-10-19 Thomas J. Walsh, Sergiu Goschin

Inverse Reinforcement Learning in Contextual MDPs

We consider the task of Inverse Reinforcement Learning in Contextual Markov Decision Processes (MDPs). In this setting, contexts, which define the reward and transition kernel, are sampled from a distribution. In addition, although the…

Machine Learning · Computer Science 2021-01-01 Stav Belogolovsky, Philip Korsunsky, Shie Mannor, Chen Tessler, Tom Zahavy