Related papers: Control with adaptive Q-learning

Single-partition adaptive Q-learning

This paper introduces single-partition adaptive Q-learning (SPAQL), an algorithm for model-free episodic reinforcement learning (RL), which adaptively partitions the state-action space of a Markov decision process (MDP), while…

Machine Learning · Computer Science 2020-07-15 João Pedro Araújo, Mário Figueiredo, Miguel Ayala Botto

Q-learning for Optimal Control of Continuous-time Systems

In this paper, two Q-learning (QL) methods are proposed and their convergence theories are established for addressing the model-free optimal control problem of general nonlinear continuous-time systems. By introducing the Q-function for…

Systems and Control · Computer Science 2014-10-14 Biao Luo, Derong Liu, Tingwen Huang

ACL-QL: Adaptive Conservative Level in Q-Learning for Offline Reinforcement Learning

Offline Reinforcement Learning (RL), which operates solely on static datasets without further interactions with the environment, provides an appealing alternative to learning a safe and promising control policy. The prevailing methods…

Machine Learning · Computer Science 2025-03-18 Kun Wu, Yinuo Zhao, Zhiyuan Xu, Zhengping Che, Chengxiang Yin, Chi Harold Liu, Feifei Feng, Jian Tang

Multi-Timescale Ensemble Q-learning for Markov Decision Process Policy Optimization

Reinforcement learning (RL) is a classical tool to solve network control or policy optimization problems in unknown environments. The original Q-learning suffers from performance and complexity challenges across very large networks. Herein,…

Machine Learning · Computer Science 2024-09-02 Talha Bozkus, Urbashi Mitra

Parallel $Q$-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation

Reinforcement learning is time-consuming for complex tasks due to the need for large amounts of training data. Recent advances in GPU-based simulation, such as Isaac Gym, have sped up data collection thousands of times on a commodity GPU.…

Machine Learning · Computer Science 2023-07-25 Zechu Li, Tao Chen, Zhang-Wei Hong, Anurag Ajay, Pulkit Agrawal

Policy Optimization in Multi-Agent Settings under Partially Observable Environments

This work leverages adaptive social learning to estimate partially observable global states in multi-agent reinforcement learning (MARL) problems. Unlike existing methods, the proposed approach enables the concurrent operation of social…

Multiagent Systems · Computer Science 2025-08-11 Ainur Zhaikhan, Malek Khammassi, Ali H. Sayed

Automaton Constrained Q-Learning

Real-world robotic tasks often require agents to achieve sequences of goals while respecting time-varying safety constraints. However, standard Reinforcement Learning (RL) paradigms are fundamentally limited in these settings. A natural…

Robotics · Computer Science 2025-12-02 Anastasios Manganaris, Vittorio Giammarino, Ahmed H. Qureshi

A Deep Reinforcement Learning Architecture for Multi-stage Optimal Control

Deep reinforcement learning for high dimensional, hierarchical control tasks usually requires the use of complex neural networks as functional approximators, which can lead to inefficiency, instability and even divergence in the training…

Machine Learning · Computer Science 2019-11-26 Yuguang Yang

Comparison of Reinforcement Learning algorithms applied to the Cart Pole problem

Designing optimal controllers continues to be challenging as systems are becoming complex and are inherently nonlinear. The principal advantage of reinforcement learning (RL) is its ability to learn from the interaction with the environment…

Machine Learning · Computer Science 2018-10-05 Savinay Nagendra, Nikhil Podila, Rashmi Ugarakhod, Koshy George

Q-Learning in enormous action spaces via amortized approximate maximization

Applying Q-learning to high-dimensional or continuous action spaces can be difficult due to the required maximization over the set of possible actions. Motivated by techniques from amortized inference, we replace the expensive maximization…

Machine Learning · Computer Science 2020-01-23 Tom Van de Wiele, David Warde-Farley, Andriy Mnih, Volodymyr Mnih

Actions Speak What You Want: Provably Sample-Efficient Reinforcement Learning of the Quantal Stackelberg Equilibrium from Strategic Feedbacks

We study reinforcement learning (RL) for learning a Quantal Stackelberg Equilibrium (QSE) in an episodic Markov game with a leader-follower structure. Specifically, at the outset of the game, the leader announces her policy to the follower…

Machine Learning · Computer Science 2023-07-27 Siyu Chen, Mengdi Wang, Zhuoran Yang

Phase-Aware Policy Learning for Skateboard Riding of Quadruped Robots via Feature-wise Linear Modulation

Skateboards offer a compact and efficient means of transportation as a type of personal mobility device. However, controlling them with legged robots poses several challenges for policy learning due to perception-driven interactions and…

Robotics · Computer Science 2026-04-22 Minsung Yoon, Jeil Jeong, Sung-Eui Yoon

Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning

Learning and planning in partially-observable domains is one of the most difficult problems in reinforcement learning. Traditional methods consider these two problems as independent, resulting in a classical two-stage paradigm: first learn…

Artificial Intelligence · Computer Science 2019-11-25 Tianyu Li, Bogdan Mazoure, Doina Precup, Guillaume Rabusseau

Q-SpiRL: Quantum Spiking Reinforcement Learning for Adaptive Robot Navigation

Adaptive robot navigation in dynamic environments requires policies that can reach the target reliably while producing efficient and stable trajectories. This paper presents Q-SpiRL, a quantum spiking reinforcement learning framework for…

Robotics · Computer Science 2026-05-21 Mohamed Khair Altrabulsi, Nouhaila Innan, Alberto Marchisio, Muhammad Kashif, Muhammad Shafique

Approximate information state based convergence analysis of recurrent Q-learning

In spite of the large literature on reinforcement learning (RL) algorithms for partially observable Markov decision processes (POMDPs), a complete theoretical understanding is still lacking. In a partially observable setting, the history of…

Machine Learning · Computer Science 2023-06-12 Erfan Seyedsalehi, Nima Akbarzadeh, Amit Sinha, Aditya Mahajan

Offline RL for Adaptive Policy Retrieval in Prior Authorization

Prior authorization (PA) requires interpretation of complex and fragmented coverage policies, yet existing retrieval-augmented systems rely on static top-$K$ strategies with fixed numbers of retrieved sections. Such fixed retrieval can be…

Information Retrieval · Computer Science 2026-04-08 Ruslan Sharifullin, Maxim Gorshkov, Hannah Clay

Q-Learning for Robust Satisfaction of Signal Temporal Logic Specifications

This paper addresses the problem of learning optimal policies for satisfying signal temporal logic (STL) specifications by agents with unknown stochastic dynamics. The system is modeled as a Markov decision process, in which the states…

Systems and Control · Computer Science 2016-09-26 Derya Aksaray, Austin Jones, Zhaodan Kong, Mac Schwager, Calin Belta

Unsynchronized Decentralized Q-Learning: Two Timescale Analysis By Persistence

Non-stationarity is a fundamental challenge in multi-agent reinforcement learning (MARL), where agents update their behaviour as they learn. Many theoretical advances in MARL avoid the challenge of non-stationarity by coordinating the…

Computer Science and Game Theory · Computer Science 2025-03-19 Bora Yongacoglu, Gürdal Arslan, Serdar Yüksel

A Reinforcement Learning Framework for Some Singular Stochastic Control Problems

We develop a continuous-time reinforcement learning framework for a class of singular stochastic control problems without entropy regularization. The optimal singular control is characterized as the optimal singular control law, which is a…

Optimization and Control · Mathematics 2026-05-14 Zongxia Liang, Xiaodong Luo, Xiang Yu

Stochastic Primal-Dual Q-Learning

In this work, we present a new model-free and off-policy reinforcement learning (RL) algorithm, that is capable of finding a near-optimal policy with state-action observations from arbitrary behavior policies. Our algorithm, called the…

Optimization and Control · Mathematics 2025-07-21 Narim Jeong, Donghwan Lee, Niao He