Related papers: Data-Efficient Quadratic Q-Learning Using LMIs

200 papers

This paper introduces a novel data-driven approach to design a linear quadratic regulator (LQR) using a reinforcement learning (RL) algorithm that does not require a system model. The key contribution is to perform policy iteration (PI) by…

Systems and Control · Electrical Eng. & Systems 2023-11-20 Soroush Asri, Luis Rodrigues

Recently, reinforcement learning (RL) has been receiving more and more attention due to its successful demonstrations outperforming human performance in certain challenging tasks. In our recent paper `primal-dual Q-learning framework for LQR…

Optimization and Control · Mathematics 2018-11-22 Donghwan Lee, Jianghai Hu

This paper studies data-driven approaches to the continuous-time linear quadratic regulator (LQR) problem based on two existing parameterizations, namely a closed-loop (CL) parameterization from behavioral system theory and an integral…

Optimization and Control · Mathematics 2026-05-01 Armin Gießler, Felix Thömmes, Sören Hohmann

This paper introduces and analyzes an improved Q-learning algorithm for discrete-time linear time-invariant systems. The proposed method does not require any knowledge of the system dynamics, and it enjoys significant efficiency advantages…

Systems and Control · Electrical Eng. & Systems 2023-04-03 Victor G. Lopez, Mohammad Alsalti, Matthias A. Müller

First-order methods for quadratic optimization such as OSQP are widely used for large-scale machine learning and embedded optimal control, where many related problems must be rapidly solved. These methods face two persistent challenges:…

Regularized Markov Decision Processes serve as models of sequential decision making under uncertainty wherein the decision maker has limited information processing capacity and/or aversion to model ambiguity. With functional approximation,…

Artificial Intelligence · Computer Science 2025-02-11 Jiachen Xi, Alfredo Garcia, Petar Momcilovic

The fine-tuning of pre-trained large language models (LLMs) using reinforcement learning (RL) is generally formulated as direct policy optimization. This approach was naturally favored as it efficiently improves a pretrained LLM, seen as an…

Linear-Quadratic (LQ) problems that arise in systems and controls include the classical optimal control problems of the Linear Quadratic Regulator (LQR) in both its deterministic and stochastic forms, as well as $H^\infty$-analysis (the…

Systems and Control · Electrical Eng. & Systems 2024-01-04 Bassam Bamieh

Effectively leveraging large, previously collected datasets in reinforcement learning (RL) is a key challenge for large-scale real-world applications. Offline RL algorithms promise to learn effective policies from previously-collected,…

Machine Learning · Computer Science 2020-08-20 Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine

In recent years there has been a collective research effort to find new formulations of reinforcement learning that are simultaneously more efficient and more amenable to analysis. This paper concerns one approach that builds on the linear…

Optimization and Control · Mathematics 2022-10-19 Fan Lu, Prashant Mehta, Sean Meyn, Gergely Neu

This paper introduces Meta-Q-Learning (MQL), a new off-policy algorithm for meta-Reinforcement Learning (meta-RL). MQL builds upon three simple ideas. First, we show that Q-learning is competitive with state-of-the-art meta-RL algorithms if…

Machine Learning · Computer Science 2020-04-07 Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola

This paper applies a reinforcement learning (RL) method to solve infinite horizon continuous-time stochastic linear quadratic problems, where drift and diffusion terms in the dynamics may depend on both the state and control. Based on…

Optimization and Control · Mathematics 2021-09-17 Na Li, Xun Li, Jing Peng, Zuo Quan Xu

The goal of robust reinforcement learning (RL) is to learn a policy that is robust against the uncertainty in model parameters. Parameter uncertainty commonly occurs in many real-world RL applications due to simulator modeling errors,…

Machine Learning · Computer Science 2022-10-19 Kishan Panaganti, Zaiyan Xu, Dileep Kalathil, Mohammad Ghavamzadeh

In this paper, two Q-learning (QL) methods are proposed and their convergence theories are established for addressing the model-free optimal control problem of general nonlinear continuous-time systems. By introducing the Q-function for…

Systems and Control · Computer Science 2014-10-14 Biao Luo, Derong Liu, Tingwen Huang

We study reinforcement learning in infinite-horizon discounted Markov decision processes with continuous state spaces, where data are generated online from a single trajectory under a Markovian behavior policy. To avoid maintaining an…

Machine Learning · Computer Science 2026-03-05 Shengbo Wang

In this work, we present a new model-free and off-policy reinforcement learning (RL) algorithm that is capable of finding a near-optimal policy with state-action observations from arbitrary behavior policies. Our algorithm, called the…

Optimization and Control · Mathematics 2025-07-21 Narim Jeong, Donghwan Lee, Niao He

We propose a new reinforcement learning algorithm derived from a regularized linear-programming formulation of optimal control in MDPs. The method is closely related to the classic Relative Entropy Policy Search (REPS) algorithm of Peters…

Machine Learning · Computer Science 2021-03-01 Joan Bas-Serrano, Sebastian Curi, Andreas Krause, Gergely Neu

Reinforcement learning (RL) for exponential-utility optimization in discounted Markov decision processes (MDPs) lacks principled value-based algorithms. We address this gap in the fixed risk-aversion setting. Building on the Bellman-type…

Machine Learning · Computer Science 2026-05-11 Gugan Thoppe, L. A. Prashanth, Ankur Naskar, Sanjay Bhat

This paper introduces an innovative approach based on policy iteration (PI), a reinforcement learning (RL) algorithm, to obtain an optimal observer with a quadratic cost function. This observer is designed for systems with a given…

Systems and Control · Electrical Eng. & Systems 2023-11-29 Soroush Asri, Luis Rodrigues

In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm only relies on using persistently exciting input-output data, measured offline. No…

Systems and Control · Electrical Eng. & Systems 2024-08-21 Mohammad Alsalti, Victor G. Lopez, Matthias A. Müller