Related papers: Submodular Reinforcement Learning

200 papers

In Reinforcement Learning (abbreviated as RL), an agent interacts with the environment via a set of possible actions, and a reward is generated from some unknown distribution. The task here is to find an optimal set of actions such that the…

Machine Learning · Computer Science 2025-07-21 Aditi Anand, Suman Banerjee, Dildar Ali

In classic Reinforcement Learning (RL), the agent maximizes an additive objective of the visited states, e.g., a value function. Unfortunately, objectives of this type cannot model many real-world applications such as experiment design,…

Machine Learning · Computer Science 2024-07-16 Riccardo De Santi, Manish Prajapat, Andreas Krause

Reinforcement learning (RL) with sparse and deceptive rewards is challenging because non-zero rewards are rarely obtained. Hence, the gradient calculated by the agent can be stochastic and without valid information. Recent studies that…

Machine Learning · Computer Science 2024-02-08 Guojian Wang, Faguo Wu, Xiao Zhang, Jianxiang Liu

In this paper, we study cooperative multi-agent reinforcement learning (MARL) where the joint reward exhibits submodularity, which is a natural property capturing diminishing marginal returns when adding agents to a team. Unlike standard…

Machine Learning · Computer Science 2026-03-10 Wenjing Chen, Chengyuan Qian, Shuo Xing, Yi Zhou, Victoria Crawford

Although reinforcement learning has become very popular in recent years, successful applications to different kinds of operations research problems are rather scarce. Reinforcement learning is based on the well-studied dynamic…

Machine Learning · Computer Science 2020-04-03 Manuel Schneckenreither

Offline reinforcement learning (RL) refers to the problem of learning policies entirely from a large batch of previously collected data. This problem setting offers the promise of utilizing such datasets to acquire policies without any…

Machine Learning · Computer Science 2020-11-24 Tianhe Yu, Garrett Thomas, Lantao Yu, Stefano Ermon, James Zou, Sergey Levine, Chelsea Finn, Tengyu Ma

Exploration is essential in reinforcement learning as an agent relies on trial and error to learn an optimal policy. However, when rewards are sparse, naive exploration strategies, like noise injection, are often insufficient. Intrinsic…

Machine Learning · Computer Science 2026-01-30 Minjae Cho, Huy Trong Tran

Single-trajectory reinforcement learning (RL) methods aim to optimize policies from datasets consisting of (prompt, response, reward) triplets, where scalar rewards are directly available. This supervision format is highly practical, as it…

Machine Learning · Computer Science 2025-12-23 Bilal Faye, Hanane Azzag, Mustapha Lebbah

Mastering deep reinforcement learning (DRL) proves challenging in tasks featuring scant rewards. These limited rewards merely signify whether the task is partially or entirely accomplished, necessitating various exploration actions before…

Machine Learning · Computer Science 2024-04-11 Guojian Wang, Faguo Wu, Xiao Zhang

Safe Reinforcement Learning (Safe RL) aims to train an RL agent to maximize its performance in real-world environments while adhering to safety constraints, as exceeding safety violation limits can result in severe consequences. In this…

Machine Learning · Computer Science 2025-04-07 Hanping Zhang, Yuhong Guo

This thesis develops theoretical frameworks and algorithms that advance constrained reinforcement learning (RL) across control, preference learning, and alignment of large language models. The first contribution addresses constrained Markov…

Machine Learning · Computer Science 2025-12-12 Akhil Agnihotri

While originally developed for continuous control problems, Proximal Policy Optimization (PPO) has emerged as the work-horse of a variety of reinforcement learning (RL) applications, including the fine-tuning of generative models.…

Few-step diffusion models enable efficient high-resolution image synthesis but struggle to align with specific downstream objectives due to limitations of existing reinforcement learning (RL) methods in low-step regimes with limited state…

Machine Learning · Computer Science 2026-03-02 Ziyi Zhang, Li Shen, Sen Zhang, Deheng Ye, Yong Luo, Miaojing Shi, Dongjing Shan, Bo Du, Dacheng Tao

Exploration in reinforcement learning (RL) remains an open challenge. RL algorithms rely on observing rewards to train the agent, and if informative rewards are sparse the agent learns slowly or may not learn at all. To improve exploration…

Machine Learning · Computer Science 2024-11-12 Simone Parisi, Alireza Kazemipour, Michael Bowling

This paper presents a novel form of policy gradient for model-free reinforcement learning (RL) with improved exploration properties. Current policy-based methods use entropy regularization to encourage undirected exploration of the reward…

Machine Learning · Computer Science 2017-03-17 Ofir Nachum, Mohammad Norouzi, Dale Schuurmans

In sparse reward scenarios of reinforcement learning (RL), the memory mechanism provides promising shortcuts to policy optimization by reflecting on past experiences like humans. However, current memory-based RL methods simply store and…

Machine Learning · Computer Science 2026-05-28 Renye Yan, Yaozhong Gan, You Wu, Junliang Xing, Ling Liangn, Yeshang Zhu, Yimao Cai

Reinforcement Learning (RL) has gained substantial attention across diverse application domains and theoretical investigations. Existing literature on RL theory largely focuses on risk-neutral settings where the decision-maker learns to…

Machine Learning · Computer Science 2024-12-24 Zhengqi Wu, Renyuan Xu

Group Relative Policy Optimization (GRPO) has become a cornerstone of modern reinforcement learning alignment, prized for its efficacy in forgoing an explicit value-critic by leveraging reward normalization across sampled trajectory…

Computation and Language · Computer Science 2026-05-29 Redacted by arXiv

We investigate reinforcement learning (RL) for privileged planning in autonomous driving. State-of-the-art approaches for this task are rule-based, but these methods do not scale to the long tail. RL, on the other hand, is scalable and does…

Machine Learning · Computer Science 2025-08-22 Bernhard Jaeger, Daniel Dauner, Jens Beißwenger, Simon Gerstenecker, Kashyap Chitta, Andreas Geiger

State-of-the-art reinforcement learning (RL) algorithms typically use random sampling (e.g., $\epsilon$-greedy) for exploration, but this method fails on hard exploration tasks like Montezuma's Revenge. To address the challenge of…

Machine Learning · Computer Science 2022-11-21 Eric Chen, Zhang-Wei Hong, Joni Pajarinen, Pulkit Agrawal