Related papers: Continuous-time Value Function Approximation in Re…

Multi-task Reinforcement Learning in Reproducing Kernel Hilbert Spaces via Cross-learning

Reinforcement learning (RL) is a framework to optimize a control policy using rewards that are revealed by the system as a response to a control action. In its standard form, RL involves a single agent that uses its policy to accomplish a…

Systems and Control · Electrical Eng. & Systems 2021-11-24 Juan Cervino , Juan Andres Bazerque , Miguel Calvo-Fullana , Alejandro Ribeiro

Continuous Control with Coarse-to-fine Reinforcement Learning

Despite recent advances in improving the sample-efficiency of reinforcement learning (RL) algorithms, designing an RL algorithm that can be practically deployed in real-world environments remains a challenge. In this paper, we present…

Robotics · Computer Science 2024-07-11 Younggyo Seo , Jafar Uruç , Stephen James

Value Function Approximations via Kernel Embeddings for No-Regret Reinforcement Learning

We consider the regret minimization problem in reinforcement learning (RL) in the episodic setting. In many real-world RL environments, the state and action spaces are continuous or very large. Existing approaches establish regret…

Machine Learning · Computer Science 2022-06-29 Sayak Ray Chowdhury , Rafael Oliveira

Reinforcement Learning for Infinite-Dimensional Systems

Interest in reinforcement learning (RL) for large-scale systems, comprising extensive populations of intelligent agents interacting with heterogeneous environments, has surged significantly across diverse scientific domains in recent years.…

Systems and Control · Electrical Eng. & Systems 2025-09-16 Wei Zhang , Jr-Shin Li

Rates of Convergence in Certain Native Spaces of Approximations used in Reinforcement Learning

This paper studies convergence rates for some value function approximations that arise in a collection of reproducing kernel Hilbert spaces (RKHS) $H(\Omega)$. By casting an optimal control problem in a specific class of native spaces,…

Systems and Control · Electrical Eng. & Systems 2023-11-20 Ali Bouland , Shengyuan Niu , Sai Tej Paruchuri , Andrew Kurdila , John Burns , Eugenio Schuster

Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation

Continuous-time reinforcement learning (CTRL) provides a principled framework for sequential decision-making in environments where interactions evolve continuously over time. Despite its empirical success, the theoretical understanding of…

Machine Learning · Computer Science 2025-05-22 Runze Zhao , Yue Yu , Adams Yiyue Zhu , Chen Yang , Dongruo Zhou

Continuous-Time Value Iteration for Multi-Agent Reinforcement Learning

Existing reinforcement learning (RL) methods struggle with complex dynamical systems that demand interactions at high frequencies or irregular time intervals. Continuous-time RL (CTRL) has emerged as a promising alternative by replacing…

Machine Learning · Computer Science 2026-02-20 Xuefeng Wang , Lei Zhang , Henglin Pu , Ahmed H. Qureshi , Husheng Li

On Bellman equations for continuous-time policy evaluation I: discretization and approximation

We study the problem of computing the value function from a discretely-observed trajectory of a continuous-time diffusion process. We develop a new class of algorithms based on easily implementable numerical schemes that are compatible with…

Machine Learning · Computer Science 2024-07-09 Wenlong Mou , Yuhua Zhu

Continuous-time reinforcement learning: ellipticity enables model-free value function approximation

We study off-policy reinforcement learning for controlling continuous-time Markov diffusion processes with discrete-time observations and actions. We consider model-free algorithms with function approximation that learn value and advantage…

Machine Learning · Computer Science 2026-04-17 Wenlong Mou

Deep Reinforcement Learning with Adjustments

Deep reinforcement learning (RL) algorithms can learn complex policies to optimize agent operation over time. RL algorithms have shown promising results in solving complicated problems in recent years. However, their application on…

Machine Learning · Computer Science 2021-09-29 Hamed Khorasgani , Haiyan Wang , Chetan Gupta , Susumu Serita

Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces

We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional embedding of the policy on a reproducing kernel Hilbert space (RKHS). The usage of RKHS based…

Machine Learning · Computer Science 2020-10-16 Bogdan Mazoure , Thang Doan , Tianyu Li , Vladimir Makarenkov , Joelle Pineau , Doina Precup , Guillaume Rabusseau

ACERAC: Efficient reinforcement learning in fine time discretization

One of the main goals of reinforcement learning (RL) is to provide a way for physical machines to learn optimal behavior instead of being programmed. However, effective control of the machines usually requires fine time discretization. The…

Machine Learning · Computer Science 2022-07-12 Jakub Łyskawa , Paweł Wawrzyński

Continuous-Time Mean-Variance Portfolio Selection: A Reinforcement Learning Framework

We approach the continuous-time mean-variance (MV) portfolio selection with reinforcement learning (RL). The problem is to achieve the best tradeoff between exploration and exploitation, and is formulated as an entropy-regularized, relaxed…

Portfolio Management · Quantitative Finance 2019-05-07 Haoran Wang , Xun Yu Zhou

TDMPBC: Self-Imitative Reinforcement Learning for Humanoid Robot Control

Complex high-dimensional spaces with high Degree-of-Freedom and complicated action spaces, such as humanoid robots equipped with dexterous hands, pose significant challenges for reinforcement learning (RL) algorithms, which need to wisely…

Robotics · Computer Science 2025-02-25 Zifeng Zhuang , Diyuan Shi , Runze Suo , Xiao He , Hongyin Zhang , Ting Wang , Shangke Lyu , Donglin Wang

A Reproducing Kernel Hilbert Space Approach to Functional Calibration of Computer Models

This paper develops a frequentist solution to the functional calibration problem, where the value of a calibration parameter in a computer model is allowed to vary with the value of control variables in the physical system. The need of…

Methodology · Statistics 2021-07-20 Rui Tuo , Shiyuan He , Arash Pourhabib , Yu Ding , Jianhua Z. Huang

Operator-Guided Invariance Learning for Continuous Reinforcement Learning

Reinforcement learning (RL) with continuous time and state/action spaces is often data-intensive and brittle under nuisance variability and shift, motivating methods that exploit value-preserving structures to stabilize and improve…

Machine Learning · Computer Science 2026-05-08 Zuyuan Zhang , Fei Xu Yu , Tian Lan

Control Regularization for Reduced Variance Reinforcement Learning

Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on…

Machine Learning · Computer Science 2019-05-15 Richard Cheng , Abhinav Verma , Gabor Orosz , Swarat Chaudhuri , Yisong Yue , Joel W. Burdick

On the Design of Safe Continual RL Methods for Control of Nonlinear Systems

Reinforcement learning (RL) algorithms have been successfully applied to control tasks associated with unmanned aerial vehicles and robotics. In recent years, safe RL has been proposed to allow the safe execution of RL algorithms in…

Machine Learning · Computer Science 2025-02-25 Austin Coursey , Marcos Quinones-Grueiro , Gautam Biswas

Replicable Reinforcement Learning with Linear Function Approximation

Replication of experimental results has been a challenge faced by many scientific disciplines, including the field of machine learning. Recent work on the theory of machine learning has formalized replicability as the demand that an…

Machine Learning · Computer Science 2026-04-15 Eric Eaton , Marcel Hussing , Michael Kearns , Aaron Roth , Sikata Bela Sengupta , Jessica Sorrell

Score as Action: Fine-Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning

Reinforcement learning from human feedback (RLHF), which aligns a diffusion model with input prompt, has become a crucial step in building reliable generative AI models. Most works in this area use a discrete-time formulation, which is…

Machine Learning · Computer Science 2025-08-25 Hanyang Zhao , Haoxian Chen , Ji Zhang , David D. Yao , Wenpin Tang