Related papers: Recursive Agent Optimization

Recursive Reasoning Graph for Multi-Agent Reinforcement Learning

Multi-agent reinforcement learning (MARL) provides an efficient way for simultaneously learning policies for multiple agents interacting with each other. However, in scenarios requiring complex interactions, existing algorithms can suffer…

Machine Learning · Computer Science 2022-03-08 Xiaobai Ma, David Isele, Jayesh K. Gupta, Kikuo Fujimura, Mykel J. Kochenderfer

Experience Replay Optimization

Experience replay enables reinforcement learning agents to memorize and reuse past experiences, just as humans replay memories for the situation at hand. Contemporary off-policy algorithms either replay past experiences uniformly or utilize…

Machine Learning · Computer Science 2019-06-21 Daochen Zha, Kwei-Herng Lai, Kaixiong Zhou, Xia Hu

RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization

Agentic Reinforcement Learning (Agentic RL) has shown remarkable potential in large language model-based (LLM) agents. These works can empower LLM agents to tackle complex tasks via multi-step, tool-integrated reasoning. However, an…

Artificial Intelligence · Computer Science 2026-03-04 Siwei Zhang, Yun Xiong, Xi Chen, Zi'an Jia, Renhong Huang, Jiarong Xu, Jiawei Zhang

Multi-Agent Reinforcement Learning for Resources Allocation Optimization: A Survey

Multi-Agent Reinforcement Learning (MARL) has become a powerful framework for numerous real-world applications, modeling distributed decision-making and learning from interactions with complex environments. Resource Allocation Optimization…

Multiagent Systems · Computer Science 2025-05-01 Mohamad A. Hady, Siyi Hu, Mahardhika Pratama, Jimmy Cao, Ryszard Kowalczyk

Recursive Multi-Agent Systems

Recursive or looped language models have recently emerged as a new scaling axis by iteratively refining the same model computation over latent states to deepen reasoning. We extend such scaling principle from a single model to multi-agent…

Artificial Intelligence · Computer Science 2026-04-29 Xiyuan Yang, Jiaru Zou, Rui Pan, Ruizhong Qiu, Pan Lu, Shizhe Diao, Jindong Jiang, Hanghang Tong, Tong Zhang, Markus J. Buehler, Jingrui He, James Zou

MARO: Learning Stronger Reasoning from Social Interaction

Humans face countless scenarios that require reasoning and judgment in daily life. However, existing large language model training methods primarily allow models to learn from existing textual content or solve predetermined problems,…

Artificial Intelligence · Computer Science 2026-01-27 Yin Cai, Zhouhong Gu, Juntao Zhang, Ping Chen

PARCO: Parallel AutoRegressive Models for Multi-Agent Combinatorial Optimization

Combinatorial optimization problems involving multiple agents are notoriously challenging due to their NP-hard nature and the necessity for effective agent coordination. Despite advancements in learning-based methods, existing approaches…

Multiagent Systems · Computer Science 2025-10-23 Federico Berto, Chuanbo Hua, Laurin Luttmann, Jiwoo Son, Junyoung Park, Kyuree Ahn, Changhyun Kwon, Lin Xie, Jinkyoo Park

MARPO: A Reflective Policy Optimization for Multi Agent Reinforcement Learning

We propose Multi Agent Reflective Policy Optimization (MARPO) to alleviate the issue of sample inefficiency in multi agent reinforcement learning. MARPO consists of two key components: a reflection mechanism that leverages subsequent…

Multiagent Systems · Computer Science 2025-12-30 Cuiling Wu, Yaozhong Gan, Junliang Xing, Ying Fu

Reinforced Collaboration in Multi-Agent Flow Networks

Multi-agent systems provide a powerful way to extend large language models (LLMs) by decomposing a complex task into specialized subtasks handled by different agents. However, their performance is often hindered by error propagation,…

Machine Learning · Computer Science 2026-05-14 Zheng Wang, Yuang Liu, Yangkai Ding

Selective Reincarnation: Offline-to-Online Multi-Agent Reinforcement Learning

'Reincarnation' in reinforcement learning has been proposed as a formalisation of reusing prior computation from past experiments when training an agent in an environment. In this paper, we present a brief foray into the paradigm of…

Artificial Intelligence · Computer Science 2024-10-31 Claude Formanek, Callum Rhys Tilbury, Jonathan Shock, Kale-ab Tessera, Arnu Pretorius

Decentralized scheduling through an adaptive, trading-based multi-agent system

In multi-agent reinforcement learning systems, the actions of one agent can have a negative impact on the rewards of other agents. One way to combat this problem is to let agents trade their rewards amongst each other. Motivated by this,…

Artificial Intelligence · Computer Science 2022-07-25 Michael Kölle, Lennart Rietdorf, Kyrill Schmid

Reinforcement Learning in Economics and Finance

Reinforcement learning algorithms describe how an agent can learn an optimal action policy in a sequential decision process, through repeated experience. In a given environment, the agent policy provides him some running and terminal…

Theoretical Economics · Economics 2020-03-24 Arthur Charpentier, Romuald Elie, Carl Remlinger

Efficient Deep Reinforcement Learning with Predictive Processing Proximal Policy Optimization

Advances in reinforcement learning (RL) often rely on massive compute resources and remain notoriously sample inefficient. In contrast, the human brain is able to efficiently learn effective control strategies using limited resources. This…

Machine Learning · Computer Science 2024-01-30 Burcu Küçükoğlu, Walraaf Borkent, Bodo Rueckauer, Nasir Ahmad, Umut Güçlü, Marcel van Gerven

Automating Reinforcement Learning with Example-based Resets

Deep reinforcement learning has enabled robots to learn motor skills from environmental interactions with minimal to no prior knowledge. However, existing reinforcement learning algorithms assume an episodic setting, in which the agent…

Machine Learning · Computer Science 2022-05-27 Jigang Kim, J. hyeon Park, Daesol Cho, H. Jin Kim

Backward Curriculum Reinforcement Learning

Current reinforcement learning algorithms train an agent using forward-generated trajectories, which provide little guidance so that the agent can explore as much as possible. While realizing the value of reinforcement learning results from…

Artificial Intelligence · Computer Science 2023-09-06 KyungMin Ko

When should I search more: Adaptive Complex Query Optimization with Reinforcement Learning

Query optimization is a crucial component for the efficacy of Retrieval-Augmented Generation (RAG) systems. While reinforcement learning (RL)-based agentic and reasoning methods have recently emerged as a promising direction on query…

Artificial Intelligence · Computer Science 2026-01-30 Wei Wen, Sihang Deng, Tianjun Wei, Keyu Chen, Ruizhi Qiao, Xing Sun

RRO: LLM Agent Optimization Through Rising Reward Trajectories

Large language models (LLMs) have exhibited extraordinary performance in a variety of tasks while it remains challenging for them to solve complex multi-step tasks as agents. In practice, agents sensitive to the outcome of certain key steps…

Artificial Intelligence · Computer Science 2025-05-28 Zilong Wang, Jingfeng Yang, Sreyashi Nag, Samarth Varshney, Xianfeng Tang, Haoming Jiang, Jingbo Shang, Sheikh Muhammad Sarwar

Pushing Forward Pareto Frontiers of Proactive Agents with Behavioral Agentic Optimization

Proactive large language model (LLM) agents aim to actively plan, query, and interact over multiple turns, enabling efficient task completion beyond passive instruction following and making them essential for real-world, user-centric…

Artificial Intelligence · Computer Science 2026-02-13 Yihang Yao, Zhepeng Cen, Haohong Lin, Shiqi Liu, Zuxin Liu, Jiacheng Zhu, Zhang-Wei Hong, Laixi Shi, Ding Zhao

AMAGO: Scalable In-Context Reinforcement Learning for Adaptive Agents

We introduce AMAGO, an in-context Reinforcement Learning (RL) agent that uses sequence models to tackle the challenges of generalization, long-term memory, and meta-learning. Recent works have shown that off-policy learning can make…

Machine Learning · Computer Science 2024-02-02 Jake Grigsby, Linxi Fan, Yuke Zhu

RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation

Although multi-agent systems based on large language models show strong capabilities on multiple tasks, they are still limited by high computational overhead, information loss, and robustness. Inspired by ResNet's residual learning, we…

Artificial Intelligence · Computer Science 2025-06-02 Zhentao Xie, Chengcheng Han, Jinxin Shi, Wenjun Cui, Xin Zhao, Xingjiao Wu, Jiabao Zhao