Related papers: Adaptive Discretization for Model-Based Reinforcem…
We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces. Our algorithm is based on a novel $Q$-learning policy with adaptive data-driven discretization. The…
Discretization based approaches to solving online reinforcement learning problems have been studied extensively in practice on applications ranging from resource allocation to cache management. Two major questions in designing…
We study reinforcement learning for controlled diffusion processes with unbounded continuous state spaces, bounded continuous actions, and polynomially growing rewards: settings that arise naturally in finance, economics, and operations…
We develop a novel iterative algorithm for locally optimal experimental design under constraints, like budget or performance constraints. It is an adaptive discretization algorithm. In every iteration, a discretized version of the…
We study unconstrained Online Linear Optimization with Lipschitz losses. Motivated by the pursuit of instance optimality, we propose a new algorithm that simultaneously achieves ($i$) the AdaGrad-style second order gradient adaptivity; and…
Learning actions that are relevant to decision-making and can be executed effectively is a key problem in autonomous robotics. Current state-of-the-art action representations in robotics lack proper effect-driven learning of the robot's…
We develop adaptive discretization algorithms for locally optimal experimental design of nonlinear prediction models. With these algorithms, we refine and improve a pertinent state-of-the-art algorithm in various respects. We establish…
Despite the wealth of research into provably efficient reinforcement learning algorithms, most works focus on tabular representation and thus struggle to handle exponentially or infinitely large state-action spaces. In this paper, we…
Strong worst-case performance bounds for episodic reinforcement learning exist but fortunately in practice RL algorithms perform much better than such bounds would predict. Algorithms and theory that provide strong problem-dependent bounds…
In this paper, we study the problem of regret minimization for episodic Reinforcement Learning (RL) both in the model-free and the model-based setting. We focus on learning with general function classes and general model classes, and we…
Lipschitz bandits is a prominent version of multi-armed bandits that studies large, structured action spaces such as the $[0,1]$ interval, where similar actions are guaranteed to have similar rewards. A central theme here is the adaptive…
Episodic self-imitation learning, a novel self-imitation algorithm with a trajectory selection module and an adaptive loss function, is proposed to speed up reinforcement learning. Compared to the original self-imitation learning algorithm,…
Finite element discretizations of problems in computational physics often rely on adaptive mesh refinement (AMR) to preferentially resolve regions containing important features during simulation. However, these spatial refinement strategies…
We present a powerful general framework for designing data-dependent optimization algorithms, building upon and unifying recent techniques in adaptive regularization, optimistic gradient predictions, and problem-dependent randomization. We…
We present an adaptive regularization algorithm that can be effectively applied to the optimization problem in deep learning framework. Our regularization algorithm aims to take into account the fitness of data to the current state of model…
In this paper we investigate an adaptive discretization strategy for ill-posed linear prob- lems combined with a regularization from a class of semiiterative methods. We show that such a discretization approach in combination with a…
Reinforcement learning algorithms typically consider discrete-time dynamics, even though the underlying systems are often continuous in time. In this paper, we introduce a model-based reinforcement learning algorithm that represents…
We present a new adaptive algorithm for learning discrete distributions under distribution drift. In this setting, we observe a sequence of independent samples from a discrete distribution that is changing over time, and the goal is to…
In recent years, deep learning has been connected with optimal control as a way to define a notion of a continuous underlying learning problem. In this view, neural networks can be interpreted as a discretization of a parametric Ordinary…
Episodic control enables sample efficiency in reinforcement learning by recalling past experiences from an episodic memory. We propose a new model-based episodic memory of trajectories addressing current limitations of episodic control. Our…