Related papers: Reinforcement Learning-Based Automatic Berthing Sy…

Research on Autonomous Robots Navigation based on Reinforcement Learning

Reinforcement learning continuously optimizes decision-making based on real-time feedback reward signals through continuous interaction with the environment, demonstrating strong adaptive and self-learning capabilities. In recent years, it…

Robotics · Computer Science 2024-08-15 Zixiang Wang, Hao Yan, Yining Wang, Zhengjia Xu, Zhuoyue Wang, Zhizhong Wu

Attention and Risk-Aware Decision Framework for Safe Autonomous Driving

Autonomous driving has attracted great interest due to its potential capability in full-unsupervised driving. Model-based and learning-based methods are widely used in autonomous driving. Model-based methods rely on pre-defined models of…

Robotics · Computer Science 2025-09-10 Zhen Tian, Fujiang Yuan, Yangfan He, Qinghao Li, Changlin Chen, Huilin Chen, Tianxiang Xu, Jianyu Duan, Yanhong Peng, Zhihao Lin

Robust active flow control over a range of Reynolds numbers using an artificial neural network trained through deep reinforcement learning

This paper focuses on the active flow control of a computational fluid dynamics simulation over a range of Reynolds numbers using deep reinforcement learning (DRL). More precisely, the proximal policy optimization (PPO) method is used to…

Fluid Dynamics · Physics 2020-06-24 Hongwei Tang, Jean Rabault, Alexander Kuhnle, Yan Wang, Tongguang Wang

Adversarial Policy Optimization in Deep Reinforcement Learning

The policy represented by the deep neural network can overfit the spurious features in observations, which hamper a reinforcement learning agent from learning effective policy. This issue becomes severe in high-dimensional state, where the…

Machine Learning · Computer Science 2023-05-01 Md Masudur Rahman, Yexiang Xue

Digital Twin Supervised Reinforcement Learning Framework for Autonomous Underwater Navigation

Autonomous navigation in underwater environments remains a major challenge due to the absence of GPS, degraded visibility, and the presence of submerged obstacles. This article investigates these issues through the case of the BlueROV2, an…

Machine Learning · Computer Science 2025-12-12 Zamirddine Mari, Mohamad Motasem Nawaf, Pierre Drap

Neuromorphic Reinforcement Learning for Quadruped Locomotion Control on Uneven Terrain

Reinforcement learning (RL) has enabled robust quadruped locomotion over complex terrain, but most learned controllers are trained offline with backpropagation in massively parallel simulation and deployed as fixed policies, limiting…

Neural and Evolutionary Computing · Computer Science 2026-05-12 Zhuangyu Han, Abhronil Sengupta

LiDAR-based drone navigation with reinforcement learning

Reinforcement learning is of increasing importance in the field of robot control and simulation plays a key role in this process. In the unmanned aerial vehicles (UAVs, drones), there is also an increase in the number of published…

Robotics · Computer Science 2023-07-27 Pawel Miera, Hubert Szolc, Tomasz Kryjak

Autonomous UAV Flight Navigation in Confined Spaces: A Reinforcement Learning Approach

Autonomous UAV inspection of confined industrial infrastructure, such as ventilation ducts, demands robust navigation policies where collisions are unacceptable. While Deep Reinforcement Learning (DRL) offers a powerful paradigm for…

Robotics · Computer Science 2025-12-19 Marco S. Tayar, Lucas K. de Oliveira, Felipe Andrade G. Tommaselli, Juliano D. Negri, Thiago H. Segreto, Ricardo V. Godoy, Marcelo Becker

PTR-PPO: Proximal Policy Optimization with Prioritized Trajectory Replay

On-policy deep reinforcement learning algorithms have low data utilization and require significant experience for policy improvement. This paper proposes a proximal policy optimization algorithm with prioritized trajectory replay (PTR-PPO)…

Machine Learning · Computer Science 2021-12-09 Xingxing Liang, Yang Ma, Yanghe Feng, Zhong Liu

Queueing Network Controls via Deep Reinforcement Learning

Novel advanced policy gradient (APG) methods, such as Trust Region policy optimization and Proximal policy optimization (PPO), have become the dominant reinforcement learning algorithms because of their ease of implementation and good…

Optimization and Control · Mathematics 2022-03-22 J. G. Dai, Mark Gluzman

ANO: A Principled Approach to Robust Policy Optimization

Proximal Policy Optimization (PPO) dominates reinforcement learning and LLM alignment but relies on a "hard clipping" mechanism that discards valuable gradients. Conversely, unconstrained methods like SPO expose the optimization to…

Artificial Intelligence · Computer Science 2026-05-07 Yiheng Zhang, Yiming Wang, Kaiyan Zhao, Zhenglin Wan, Jiayu Chen, Leong Hou U

ContractionPPO: Certified Reinforcement Learning via Differentiable Contraction Layers

Legged locomotion in unstructured environments demands not only high-performance control policies but also formal guarantees to ensure robustness under perturbations. Control methods often require carefully designed reference trajectories,…

Robotics · Computer Science 2026-03-23 Vrushabh Zinage, Narek Harutyunyan, Eric Verheyden, Fred Y. Hadaegh, Soon-Jo Chung

On-Policy RL with Optimal Reward Baseline

Reinforcement learning algorithms are fundamental to align large language models with human preferences and to enhance their reasoning capabilities. However, current reinforcement learning algorithms often suffer from training instability…

Machine Learning · Computer Science 2025-06-05 Yaru Hao, Li Dong, Xun Wu, Shaohan Huang, Zewen Chi, Furu Wei

Autonomous Six-Degree-of-Freedom Spacecraft Docking Maneuvers via Reinforcement Learning

A policy for six-degree-of-freedom docking maneuvers is developed through reinforcement learning and implemented as a feedback control law. Reinforcement learning provides a potential framework for robust, autonomous maneuvers in uncertain…

Systems and Control · Electrical Eng. & Systems 2020-08-10 Charles E. Oestreich, Richard Linares, Ravi Gondhalekar

Proximal Policy Optimization for Tracking Control Exploiting Future Reference Information

In recent years, reinforcement learning (RL) has gained increasing attention in control engineering. Especially, policy gradient methods are widely used. In this work, we improve the tracking performance of proximal policy optimization…

Machine Learning · Computer Science 2021-07-21 Jana Mayer, Johannes Westermann, Juan Pedro Gutiérrez H. Muriedas, Uwe Mettin, Alexander Lampe

AAPO: Enhancing the Reasoning Capabilities of LLMs with Advantage Margin

Reinforcement learning (RL) has emerged as an effective approach for enhancing the reasoning capabilities of large language models (LLMs), especially in scenarios where supervised fine-tuning (SFT) falls short due to limited…

Machine Learning · Computer Science 2026-04-15 Jian Xiong, Jingbo Zhou, Jingyong Ye, Qiang Huang, Dejing Dou

Neural-based Control for CubeSat Docking Maneuvers

Autonomous Rendezvous and Docking (RVD) have been extensively studied in recent years, addressing the stringent requirements of spacecraft dynamics variations and the limitations of GNC systems. This paper presents an innovative approach…

Machine Learning · Computer Science 2024-10-22 Matteo Stoisa, Federica Paganelli Azza, Luca Romanelli, Mattia Varile

Efficient Real-World Autonomous Racing via Attenuated Residual Policy Optimization

Residual policy learning (RPL), in which a learned policy refines a static base policy using deep reinforcement learning (DRL), has shown strong performance across various robotic applications. Its effectiveness is particularly evident in…

Robotics · Computer Science 2026-03-16 Raphael Trumpp, Denis Hoornaert, Mirco Theile, Marco Caccamo

Safe Reinforcement Learning for Autonomous Vehicles through Parallel Constrained Policy Optimization

Reinforcement learning (RL) is attracting increasing interests in autonomous driving due to its potential to solve complex classification and control problems. However, existing RL algorithms are rarely applied to real vehicles for two…

Machine Learning · Computer Science 2020-03-04 Lu Wen, Jingliang Duan, Shengbo Eben Li, Shaobing Xu, Huei Peng

Bounded Ratio Reinforcement Learning

Proximal Policy Optimization (PPO) has become the predominant algorithm for on-policy reinforcement learning due to its scalability and empirical robustness across domains. However, there is a significant disconnect between the underlying…

Machine Learning · Computer Science 2026-05-01 Yunke Ao, Le Chen, Bruce D. Lee, Assefa S. Wahd, Aline Czarnobai, Philipp Fürnstahl, Bernhard Schölkopf, Andreas Krause