Related papers: Data-Driven Knowledge Transfer in Batch $Q^*$ Lear…

Transfer Q-learning

Time-inhomogeneous finite-horizon Markov decision processes (MDP) are frequently employed to model decision-making in dynamic treatment regimes and other statistical reinforcement learning (RL) scenarios. These fields, especially healthcare…

Machine Learning · Computer Science 2025-10-21 Elynn Chen, Sai Li, Michael I. Jordan

$QD$-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations

The paper considers a class of multi-agent Markov decision processes (MDPs), in which the network agents respond differently (as manifested by the instantaneous one-stage random costs) to a global controlled state and the control actions of…

Machine Learning · Statistics 2015-06-04 Soummya Kar, José M. F. Moura, H. Vincent Poor

Learn Dynamic-Aware State Embedding for Transfer Learning

Transfer reinforcement learning aims to improve the sample efficiency of solving unseen new tasks by leveraging experiences obtained from previous tasks. We consider the setting where all tasks (MDPs) share the same environment dynamic…

Machine Learning · Computer Science 2021-01-08 Kaige Yang

Approximate Q-Learning for Controlled Diffusion Processes and its Near Optimality

We study a Q-learning algorithm for continuous-time stochastic control problems. The proposed algorithm uses the sampled state process by discretizing the state and control action spaces under piecewise constant control processes. We show…

Optimization and Control · Mathematics 2023-03-10 Erhan Bayraktar, Ali Devran Kara

Robust Batch Policy Learning in Markov Decision Processes

We study the offline data-driven sequential decision making problem in the framework of Markov decision process (MDP). In order to enhance the generalizability and adaptivity of the learned policy, we propose to evaluate each policy by a…

Statistics Theory · Mathematics 2021-11-11 Zhengling Qi, Peng Liao

Reinforcement Learning in Switching Non-Stationary Markov Decision Processes: Algorithms and Convergence Analysis

Reinforcement learning in non-stationary environments is challenging due to abrupt and unpredictable changes in dynamics, often causing traditional algorithms to fail to converge. However, in many real-world cases, non-stationarity has some…

Machine Learning · Computer Science 2025-03-25 Mohsen Amiri, Sindri Magnússon

Deep Transfer $Q$-Learning for Offline Non-Stationary Reinforcement Learning

In dynamic decision-making scenarios across business and healthcare, leveraging sample trajectories from diverse populations can significantly enhance reinforcement learning (RL) performance for specific target populations, especially when…

Machine Learning · Statistics 2025-04-15 Jinhang Chai, Elynn Chen, Jianqing Fan

A Markov Decision Process Approach to Active Meta Learning

In supervised learning, we fit a single statistical model to a given data set, assuming that the data is associated with a singular task, which yields well-tuned models for specific use, but does not adapt well to new contexts. By contrast,…

Machine Learning · Computer Science 2020-09-11 Bingjia Wang, Alec Koppel, Vikram Krishnamurthy

The Impact of Data Distribution on Q-learning with Function Approximation

We study the interplay between the data distribution and Q-learning-based algorithms with function approximation. We provide a unified theoretical and empirical analysis as to how different properties of the data distribution influence the…

Machine Learning · Computer Science 2023-02-13 Pedro P. Santos, Diogo S. Carvalho, Alberto Sardinha, Francisco S. Melo

Expert-Guided Symmetry Detection in Markov Decision Processes

Learning a Markov Decision Process (MDP) from a fixed batch of trajectories is a non-trivial task whose outcome's quality depends on both the amount and the diversity of the sampled regions of the state-action space. Yet, many MDPs are…

Machine Learning · Computer Science 2022-03-08 Giorgio Angelotti, Nicolas Drougard, Caroline P. C. Chanel

Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model

The curse of dimensionality is a widely known issue in reinforcement learning (RL). In the tabular setting where the state space $\mathcal{S}$ and the action space $\mathcal{A}$ are both finite, to obtain a nearly optimal policy with…

Machine Learning · Computer Science 2022-10-28 Bingyan Wang, Yuling Yan, Jianqing Fan

A Unified Meta-Learning Framework for Dynamic Transfer Learning

Transfer learning refers to the transfer of knowledge or information from a relevant source task to a target task. However, most existing works assume both tasks are sampled from a stationary task distribution, thereby leading to the…

Machine Learning · Computer Science 2022-07-06 Jun Wu, Jingrui He

Q-learning for Quantile MDPs: A Decomposition, Performance, and Convergence Analysis

In Markov decision processes (MDPs), quantile risk measures such as Value-at-Risk are a standard metric for modeling RL agents' preferences for certain outcomes. This paper proposes a new Q-learning algorithm for quantile optimization in…

Machine Learning · Computer Science 2024-11-01 Jia Lin Hau, Erick Delage, Esther Derman, Mohammad Ghavamzadeh, Marek Petrik

BATS: Best Action Trajectory Stitching

The problem of offline reinforcement learning focuses on learning a good policy from a log of environment interactions. Past efforts for developing algorithms in this area have revolved around introducing constraints to online reinforcement…

Machine Learning · Computer Science 2022-04-27 Ian Char, Viraj Mehta, Adam Villaflor, John M. Dolan, Jeff Schneider

SMDP-Based Dynamic Batching for Improving Responsiveness and Energy Efficiency of Batch Services

For servers incorporating parallel computing resources, batching is a pivotal technique for providing efficient and economical services at scale. Parallel computing resources exhibit heightened computational and energy efficiency when…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-01-07 Yaodan Xu, Sheng Zhou, Zhisheng Niu

Transition Transfer $Q$-Learning for Composite Markov Decision Processes

To bridge the gap between empirical success and theoretical understanding in transfer reinforcement learning (RL), we study a principled approach with provable performance guarantees. We introduce a novel composite MDP framework where…

Machine Learning · Statistics 2025-02-04 Jinhang Chai, Elynn Chen, Lin Yang

Mutual Information Based Knowledge Transfer Under State-Action Dimension Mismatch

Deep reinforcement learning (RL) algorithms have achieved great success on a wide variety of sequential decision-making tasks. However, many of these algorithms suffer from high sample complexity when learning from scratch using…

Machine Learning · Statistics 2020-06-15 Michael Wan, Tanmay Gangwani, Jian Peng

A Two stage Adaptive Knowledge Transfer Evolutionary Multi-tasking Based on Population Distribution for Multi/Many-Objective Optimization

Multi-tasking optimization can usually achieve better performance than traditional single-tasking optimization through knowledge transfer between tasks. However, current multi-tasking optimization algorithms have some deficiencies. For high…

Neural and Evolutionary Computing · Computer Science 2021-08-03 Zhengping Liang, Weiqi Liang, Xiuju Xu, Ling Liu, Zexuan Zhu

A Taxonomy of Similarity Metrics for Markov Decision Processes

Although the notion of task similarity is potentially interesting in a wide range of areas such as curriculum learning or automated planning, it has mostly been tied to transfer learning. Transfer is based on the idea of reusing the…

Machine Learning · Computer Science 2021-03-09 Álvaro Visús, Javier García, Fernando Fernández

Target Transfer Q-Learning and Its Convergence Analysis

Q-learning is one of the most popular methods in Reinforcement Learning (RL). Transfer Learning aims to utilize the learned knowledge from source tasks to help new tasks to improve the sample complexity of the new tasks. Considering that…

Machine Learning · Computer Science 2018-09-25 Yue Wang, Qi Meng, Wei Cheng, Yuting Liu, Zhi-Ming Ma, Tie-Yan Liu