Related papers: Learning Efficiently Function Approximation for Co…

200 papers

Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable. While CMDPs serve…

Machine Learning · Computer Science 2024-02-06 Junze Deng, Yuan Cheng, Shaofeng Zou, Yingbin Liang

This paper studies systematic exploration for reinforcement learning with rich observations and function approximation. We introduce a new model called contextual decision processes, that unifies and generalizes most prior settings. Our…

Machine Learning · Computer Science 2016-12-02 Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire

Reinforcement learning (RL) typically models the interaction between the agent and environment as a Markov decision process (MDP), where the rewards that guide the agent's behavior are always observable. However, in many real-world…

Artificial Intelligence · Computer Science 2025-05-15 Montaser Mohammedalamen, Michael Bowling

Reinforcement learning in cooperative multi-agent settings has recently advanced significantly in its scope, with applications in cooperative estimation for advertising, dynamic treatment regimes, distributed control, and federated…

Machine Learning · Computer Science 2021-03-30 Abhimanyu Dubey, Alex Pentland

Markov decision processes (MDPs) are a well-studied framework for solving sequential decision-making problems under uncertainty. Exact methods for solving MDPs based on dynamic programming such as policy iteration and value iteration are…

Artificial Intelligence · Computer Science 2015-09-09 Yanping Huang

The aim of multi-task reinforcement learning is two-fold: (1) efficiently learn by training against multiple tasks and (2) quickly adapt, using limited samples, to a variety of new tasks. In this work, the tasks correspond to reward…

Machine Learning · Computer Science 2019-11-05 Nicholas C. Landolfi, Garrett Thomas, Tengyu Ma

In reinforcement learning (RL), when defining a Markov Decision Process (MDP), the environment dynamics is implicitly assumed to be stationary. This assumption of stationarity, while simplifying, can be unrealistic in many scenarios. In the…

Machine Learning · Computer Science 2021-10-15 Shagun Sodhani, Franziska Meier, Joelle Pineau, Amy Zhang

To generalize across tasks, an agent should acquire knowledge from past tasks that facilitates adaptation and exploration in future tasks. We focus on the problem of in-context adaptation and exploration, where an agent only relies on…

Machine Learning · Computer Science 2023-05-05 Chentian Jiang, Nan Rosemary Ke, Hado van Hasselt

We introduce Dynamic Contextual Markov Decision Processes (DCMDPs), a novel reinforcement learning framework for history-dependent environments that generalizes the contextual MDP framework to handle non-Markov environments, where contexts…

Machine Learning · Computer Science 2023-05-19 Guy Tennenholtz, Nadav Merlis, Lior Shani, Martin Mladenov, Craig Boutilier

We study the use of Temporal-Difference learning for estimating the structural parameters in dynamic discrete choice models. Our algorithms are based on the conditional choice probability approach but use functional approximations to…

Econometrics · Economics 2022-12-23 Karun Adusumilli, Dita Eckardt

This paper reviews model-free reinforcement learning for learning dynamical systems in uncertain environments. For this purpose, the Markov Decision Process (MDP) is reviewed. Furthermore, some learning…

Machine Learning · Computer Science 2019-05-21 Mehran Attar, Mohammadreza Dabirian

Learning reward functions for physical skills is challenging due to the vast spectrum of skills, the high dimensionality of the state and action spaces, and nuanced sensory feedback. The complexity of these tasks makes acquiring expert…

Robotics · Computer Science 2023-10-24 Yuwei Zeng, Yiqing Xu

One major limitation to the applicability of Reinforcement Learning (RL) to many practical domains is the large number of samples required to learn an optimal policy. To address this problem and improve learning efficiency, we consider a…

Machine Learning · Computer Science 2023-08-07 Roberto Cipollone, Giuseppe De Giacomo, Marco Favorito, Luca Iocchi, Fabio Patrizi

In this work, we consider the problem of collaborative multi-user reinforcement learning. In this setting there are multiple users with the same state-action space and transition probabilities but with different rewards. Under the…

Machine Learning · Computer Science 2023-05-23 Naman Agarwal, Prateek Jain, Suhas Kowshik, Dheeraj Nagaraj, Praneeth Netrapalli

Learned representations in deep reinforcement learning (DRL) have to extract task-relevant information from complex observations, balancing between robustness to distraction and informativeness to the policy. Such stable and rich…

Machine Learning · Computer Science 2021-10-28 Mete Kemertas, Tristan Aumentado-Armstrong

We address the problem of reinforcement learning in which observations may exhibit an arbitrary form of stochastic dependence on past observations and actions, i.e. environments more general than (PO)MDPs. The task for an agent is to attain…

Machine Learning · Computer Science 2009-12-30 Daniil Ryabko, Marcus Hutter

Generalization and adaptation of learned skills to novel situations is a core requirement for intelligent autonomous robots. Although contextual reinforcement learning provides a principled framework for learning and generalization of…

Machine Learning · Computer Science 2019-10-08 Pascal Klink, Hany Abdulsamad, Boris Belousov, Jan Peters

Context, the embedding of previously collected trajectories, is a powerful construct for Meta-Reinforcement Learning (Meta-RL) algorithms. By conditioning on an effective context, Meta-RL policies can easily generalize to new tasks within a…

Machine Learning · Computer Science 2020-12-16 Haotian Fu, Hongyao Tang, Jianye Hao, Chen Chen, Xidong Feng, Dong Li, Wulong Liu

We describe theoretical bounds and a practical algorithm for teaching a model by demonstration in a sequential decision making environment. Unlike previous efforts that have optimized learners that watch a teacher demonstrate a static…

Machine Learning · Computer Science 2012-10-19 Thomas J. Walsh, Sergiu Goschin

We consider the task of Inverse Reinforcement Learning in Contextual Markov Decision Processes (MDPs). In this setting, contexts, which define the reward and transition kernel, are sampled from a distribution. In addition, although the…

Machine Learning · Computer Science 2021-01-01 Stav Belogolovsky, Philip Korsunsky, Shie Mannor, Chen Tessler, Tom Zahavy