Related papers: Robust Q-learning
The $Q$-learning algorithm is a simple and widely-used stochastic approximation scheme for reinforcement learning, but the basic protocol can exhibit instability in conjunction with function approximation. Such instability can be observed…
Robust reinforcement learning (RL) is to find a policy that optimizes the worst-case performance over an uncertainty set of MDPs. In this paper, we focus on model-free robust RL, where the uncertainty set is defined to be centering at a…
In clinical practice, physicians make a series of treatment decisions over the course of a patient's disease based on his/her baseline and evolving characteristics. A dynamic treatment regime is a set of sequential decision rules that…
A dynamic treatment regime effectively incorporates both accrued information and long-term effects of treatment from specially designed clinical trials. As these become more and more popular in conjunction with longitudinal data from…
Precision medicine aims to tailor therapeutic decisions to individual patient characteristics. This objective is commonly formalized through dynamic treatment regimes, which use statistical and machine learning methods to derive sequential…
Deep reinforcement learning techniques have demonstrated superior performance in a wide variety of environments. As improvements in training algorithms continue at a brisk pace, theoretical or empirical studies on understanding what these…
Q-learning is a widely used reinforcement learning technique for solving path planning problems. It primarily involves the interaction between an agent and its environment, enabling the agent to learn an optimal strategy that maximizes…
We propose a new Q-learning variant, called 2RA Q-learning, that addresses some weaknesses of existing Q-learning methods in a principled manner. One such weakness is an underlying estimation bias which cannot be controlled and often…
The effectiveness of supervised learning techniques has made them ubiquitous in research and practice. In high-dimensional settings, supervised learning commonly relies on dimensionality reduction to improve performance and identify the…
Robust reinforcement learning (RRL) aims at seeking a robust policy to optimize the worst case performance over an uncertainty set of Markov decision processes (MDPs). This set contains some perturbed MDPs from a nominal MDP (N-MDP) that…
In real-world healthcare settings, treatment decisions often involve optimizing for multivariate outcomes such as treatment efficacy and severity of side effects based on individual preferences. However, existing statistical methods for…
Q-learning methods represent a commonly used class of algorithms in reinforcement learning: they are generally efficient and simple, and can be combined readily with function approximators for deep reinforcement learning (RL). However, the…
Despite decades of research and recent progress in adaptive control and reinforcement learning, there remains a fundamental lack of understanding in designing controllers that provide robustness to inherent non-asymptotic uncertainties…
Linear regression is a data analysis technique, which is categorized as supervised learning. By utilizing known data, we can predict unknown data. Recently, researchers have explored the use of quantum annealing (QA) to perform linear…
Quantile regression has demonstrated promising utility in longitudinal data analysis. Existing work is primarily focused on modeling cross-sectional outcomes, while outcome trajectories often carry more substantive information in practice.…
Recurrent Neural networks (RNN) have shown promising potential for learning dynamics of sequential data. However, artificial neural networks are known to exhibit poor robustness in presence of input noise, where the sequential architecture…
The goal of robust reinforcement learning (RL) is to learn a policy that is robust against the uncertainty in model parameters. Parameter uncertainty commonly occurs in many real-world RL applications due to simulator modeling errors,…
While contemporary reinforcement learning research and applications have embraced policy gradient methods as the panacea of solving learning problems, value-based methods can still be useful in many domains as long as we can wrangle with…
Neural networks achieve outstanding accuracy in classification and regression tasks. However, understanding their behavior still remains an open challenge that requires questions to be addressed on the robustness, explainability and…
In this paper, we place deep Q-learning into a control-oriented perspective and study its learning dynamics with well-established techniques from robust control. We formulate an uncertain linear time-invariant model by means of the neural…