Related papers: Primitive Agentic First-Order Optimization

Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent States

We design a simple reinforcement learning (RL) agent that implements an optimistic version of $Q$-learning and establish through regret analysis that this agent can operate with some level of competence in any environment. While we leverage…

Machine Learning · Computer Science 2021-07-13 Shi Dong, Benjamin Van Roy, Zhengyuan Zhou

OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning

Reinforcement learning (RL) has achieved impressive performance in a variety of online settings in which an agent's ability to query the environment for transitions and rewards is effectively unlimited. However, in many practical…

Machine Learning · Computer Science 2021-05-06 Anurag Ajay, Aviral Kumar, Pulkit Agrawal, Sergey Levine, Ofir Nachum

Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives

Despite the potential of reinforcement learning (RL) for building general-purpose robotic systems, training RL agents to solve robotics tasks still remains challenging due to the difficulty of exploration in purely continuous action spaces.…

Machine Learning · Computer Science 2021-10-29 Murtaza Dalal, Deepak Pathak, Ruslan Salakhutdinov

Reinforcement Learning for Combinatorial Optimization: A Survey

Many traditional algorithms for solving combinatorial optimization problems involve using hand-crafted heuristics that sequentially construct a solution. Such heuristics are designed by domain experts and may often be suboptimal due to the…

Machine Learning · Computer Science 2020-12-25 Nina Mazyavkina, Sergey Sviridov, Sergei Ivanov, Evgeny Burnaev

A First-Occupancy Representation for Reinforcement Learning

Both animals and artificial agents benefit from state representations that support rapid transfer of learning across tasks and which enable them to efficiently traverse their environments to reach rewarding states. The successor…

Machine Learning · Computer Science 2021-11-09 Ted Moskovitz, Spencer R. Wilson, Maneesh Sahani

Reinforcement Learning with Prototypical Representations

Learning effective representations in image-based environments is crucial for sample efficient Reinforcement Learning (RL). Unfortunately, in RL, representation learning is confounded with the exploratory experience of the agent -- learning…

Machine Learning · Computer Science 2021-07-21 Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto

Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification

Conservatism has led to significant progress in offline reinforcement learning (RL) where an agent learns from pre-collected datasets. However, as many real-world scenarios involve interaction among multiple agents, it is important to…

Machine Learning · Computer Science 2022-04-05 Ling Pan, Longbo Huang, Tengyu Ma, Huazhe Xu

Behavior Prior Representation learning for Offline Reinforcement Learning

Offline reinforcement learning (RL) struggles in environments with rich and noisy inputs, where the agent only has access to a fixed dataset without environment interactions. Past works have proposed common workarounds based on the…

Machine Learning · Computer Science 2023-03-01 Hongyu Zang, Xin Li, Jie Yu, Chen Liu, Riashat Islam, Remi Tachet Des Combes, Romain Laroche

SortingEnv: An Extendable RL-Environment for an Industrial Sorting Process

We present a novel reinforcement learning (RL) environment designed to both optimize industrial sorting systems and study agent behavior in evolving spaces. In simulating material flow within a sorting process our environment follows the…

Machine Learning · Computer Science 2025-03-14 Tom Maus, Nico Zengeler, Tobias Glasmachers

Flow to Control: Offline Reinforcement Learning with Lossless Primitive Discovery

Offline reinforcement learning (RL) enables the agent to effectively learn from logged data, which significantly extends the applicability of RL algorithms in real-world scenarios where exploration can be expensive or unsafe. Previous works…

Machine Learning · Computer Science 2022-12-05 Yiqin Yang, Hao Hu, Wenzhe Li, Siyuan Li, Jun Yang, Qianchuan Zhao, Chongjie Zhang

Policy Optimization in Multi-Agent Settings under Partially Observable Environments

This work leverages adaptive social learning to estimate partially observable global states in multi-agent reinforcement learning (MARL) problems. Unlike existing methods, the proposed approach enables the concurrent operation of social…

Multiagent Systems · Computer Science 2025-08-11 Ainur Zhaikhan, Malek Khammassi, Ali H. Sayed

Preferences Implicit in the State of the World

Reinforcement learning (RL) agents optimize only the features specified in a reward function and are indifferent to anything left out inadvertently. This means that we must not only specify what to do, but also the much larger space of what…

Machine Learning · Computer Science 2019-04-22 Rohin Shah, Dmitrii Krasheninnikov, Jordan Alexander, Pieter Abbeel, Anca Dragan

Offline Reinforcement Learning with Closed-Form Policy Improvement Operators

Behavior constrained policy optimization has been demonstrated to be a successful paradigm for tackling Offline Reinforcement Learning. By exploiting historical transitions, a policy is trained to maximize a learned value function while…

Machine Learning · Computer Science 2023-07-25 Jiachen Li, Edwin Zhang, Ming Yin, Qinxun Bai, Yu-Xiang Wang, William Yang Wang

Where2Start: Leveraging initial States for Robust and Sample-Efficient Reinforcement Learning

Reinforcement learning algorithms that focus on how to compute the gradient and choose the next actions have effectively improved the performance of agents. However, these algorithms are environment-agnostic. This means that the…

Machine Learning · Computer Science 2023-11-28 Pouya Parsa, Raoof Zare Moayedi, Mohammad Bornosi, Mohammad Mahdi Bejani

A General Approach of Automated Environment Design for Learning the Optimal Power Flow

Reinforcement learning (RL) algorithms are increasingly used to solve the optimal power flow (OPF) problem. Yet, the question of how to design RL environments to maximize training performance remains unanswered, both for the OPF and the…

Machine Learning · Computer Science 2025-05-14 Thomas Wolgast, Astrid Nieße

PerSim: Data-Efficient Offline Reinforcement Learning with Heterogeneous Agents via Personalized Simulators

We consider offline reinforcement learning (RL) with heterogeneous agents under severe data scarcity, i.e., we only observe a single historical trajectory for every agent under an unknown, potentially sub-optimal policy. We find that the…

Machine Learning · Computer Science 2021-11-11 Anish Agarwal, Abdullah Alomar, Varkey Alumootil, Devavrat Shah, Dennis Shen, Zhi Xu, Cindy Yang

Breaking the Computational Barrier: Provably Efficient Actor-Critic for Low-Rank MDPs

Reinforcement learning (RL) is a fundamental framework for sequential decision-making, in which an agent learns an optimal policy through interactions with an unknown environment. In settings with function approximation, many existing RL…

Machine Learning · Computer Science 2026-05-05 Ruiquan Huang, Donghao Li, Yingbin Liang, Jing Yang

A Canonical Form for First-Order Distributed Optimization Algorithms

We consider the distributed optimization problem in which a network of agents aims to minimize the average of local functions. To solve this problem, several algorithms have recently been proposed where agents perform various combinations…

Optimization and Control · Mathematics 2019-07-16 Akhil Sundararajan, Bryan Van Scoy, Laurent Lessard

Contrastive Initial State Buffer for Reinforcement Learning

In Reinforcement Learning, the trade-off between exploration and exploitation poses a complex challenge for achieving efficient learning from limited samples. While recent works have been effective in leveraging past experiences for policy…

Machine Learning · Computer Science 2024-02-27 Nico Messikommer, Yunlong Song, Davide Scaramuzza

Efficient Robotic Manipulation Through Offline-to-Online Reinforcement Learning and Goal-Aware State Information

End-to-end learning of robotic manipulation with high data efficiency is one of the key challenges in robotics. The latest methods that utilize human demonstration data and unsupervised representation learning have proven to be a promising…

Robotics · Computer Science 2021-10-22 Jin Li, Xianyuan Zhan, Zixu Xiao, Guyue Zhou