Second-Order MPC-Based Distributed Q-Learning

Samuel Mallick; Filippo Airaldi; Azita Dabiri; Bart De Schutter

Systems and Control, 2026-05-07, v2

Authors: Samuel Mallick, Filippo Airaldi, Azita Dabiri, Bart De Schutter

Abstract

The state of the art for model predictive control (MPC)-based distributed Q-learning is limited to first-order gradient updates of the MPC parameterization. In general, using second-order information can significantly improve the speed of convergence for learning, allowing the use of higher learning rates without introducing instability. This work presents a second-order extension to MPC-based Q-learning with updates distributed across local agents, relying only on locally available information and neighbor-to-neighbor communication. In simulation, the approach is demonstrated to significantly outperform first-order distributed Q-learning.

Keywords

model predictive control; machine learning; distributed consensus

Cite

@article{arxiv.2511.16424,
  title  = {Second-Order MPC-Based Distributed Q-Learning},
  author = {Samuel Mallick and Filippo Airaldi and Azita Dabiri and Bart De Schutter},
  journal= {arXiv preprint arXiv:2511.16424},
  year   = {2026}
}

Comments

6 pages, 2 figures, published in IFAC World Congress 2026
