Related papers: WALL-E: An Efficient Reinforcement Learning Resear…

200 papers

Reinforcement learning (RL) has become a pivotal component of large language model (LLM) post-training, and agentic RL extends this paradigm to operate as agents through multi-turn interaction and tool use. Scaling such systems exposes two…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-08 Zheyue Tan, Mustapha Abdullahi, Tuo Shi, Huining Yuan, Zelai Xu, Chao Yu, Boxun Li, Bo Zhao

Improving sample efficiency is central to Reinforcement Learning (RL), especially in environments where the rewards are sparse. Some recent approaches have proposed to specify reward functions as manually designed or learned reward…

Machine Learning · Computer Science 2024-01-26 Shuai Han, Mehdi Dastani, Shihan Wang

Reinforcement learning (RL) is a sub-domain of machine learning, mainly concerned with solving sequential decision-making problems by a learning agent that interacts with the decision environment to improve its behavior through the reward…

Machine Learning · Computer Science 2025-09-23 Hossein Hassani, Ehsan Hallaji, Roozbeh Razavi-Far, Mehrdad Saif, Liang Lin

The promotion of large-scale applications of reinforcement learning (RL) requires efficient training computation. While existing parallel RL frameworks encompass a variety of RL algorithms and parallelization techniques, the excessively…

Machine Learning · Computer Science 2023-12-12 Jing Hou, Guang Chen, Ruiqi Zhang, Zhijun Li, Shangding Gu, Changjun Jiang

Scaling reinforcement learning (RL) has shown strong promise for enhancing the reasoning abilities of large language models (LLMs), particularly in tasks requiring long chain-of-thought generation. However, RL training efficiency is often…

Machine Learning · Computer Science 2026-03-25 Yiqi Zhang, Huiqiang Jiang, Xufang Luo, Zhihe Yang, Chengruidong Zhang, Yifei Shen, Dongsheng Li, Yuqing Yang, Lili Qiu, Yang You

Test-Time Scaling enhances the reasoning capabilities of Large Language Models by allocating additional inference compute to broaden the exploration of the solution space. However, existing search strategies typically treat rollouts as…

Computation and Language · Computer Science 2026-05-06 Xinglin Wang, Jiayi Shi, Shaoxiong Feng, Peiwen Yuan, Yiwei Li, Yueqi Zhang, Chuyi Tan, Ji Zhang, Boyuan Pan, Yao Hu, Kan Li

Meta-reinforcement learning (meta-RL) aims to quickly solve new tasks by leveraging knowledge from prior tasks. However, previous studies often assume a single mode homogeneous task distribution, ignoring possible structured heterogeneity…

Machine Learning · Computer Science 2023-02-17 Zhendong Chu, Hongning Wang

Group-relative RL training (GRPO) samples a small group of parallel rollouts for every training prompt and uses their within-group reward spread to compute per-trajectory advantages. In agentic environments each rollout is a long multi-turn…

Machine Learning · Computer Science 2026-05-08 Zhiyuan Zhai, Xin Wang

Reinforcement Learning (RL) algorithms can suffer from poor sample efficiency when rewards are delayed and sparse. We introduce a solution that enables agents to learn temporally extended actions at multiple levels of abstraction in a…

Machine Learning · Computer Science 2019-03-11 Andrew Levy, Robert Platt, Kate Saenko

We introduce ROLL, an efficient, scalable, and user-friendly library designed for Reinforcement Learning Optimization for Large-scale Learning. ROLL caters to three primary user groups: tech pioneers aiming for cost-effective,…

Reinforcement learning (RL) is a critical stage in post-training large language models (LLMs), involving repeated interaction between rollout generation, reward evaluation, and centralized learning. Distributing rollout execution offers…

Reinforcement learning (RL) has emerged as an effective post-training paradigm for enhancing the reasoning capabilities of multimodal large language model (MLLM). However, current RL pipelines often suffer from training inefficiencies…

Machine Learning · Computer Science 2026-03-04 Linghao Zhu, Yiran Guan, Dingkang Liang, Jianzhong Ju, Zhenbo Luo, Bin Qin, Jian Luan, Yuliang Liu, Xiang Bai

Reinforcement learning (RL) systems have countless applications, from energy-grid management to protein design. However, such real-world scenarios are often extremely difficult, combinatorial in nature, and require complex coordination…

Parallel thinking has emerged as a novel approach for enhancing the reasoning capabilities of large language models (LLMs) by exploring multiple reasoning paths concurrently. However, activating such capabilities through training remains…

Computation and Language · Computer Science 2025-09-15 Tong Zheng, Hongming Zhang, Wenhao Yu, Xiaoyang Wang, Runpeng Dai, Rui Liu, Huiwen Bao, Chengsong Huang, Heng Huang, Dong Yu

Existing agents for solving tasks such as ML engineering rely on prompting powerful language models. As a result, these agents do not improve with more experience. In this paper, we show that agents backed by weaker models that improve via…

Machine Learning · Computer Science 2025-09-04 Sherry Yang, Joy He-Yueya, Percy Liang

Reinforcement Learning (RL) is a method for learning decision-making tasks that could enable robots to learn and adapt to their situation on-line. For an RL algorithm to be practical for robotic control tasks, it must learn in very few…

Artificial Intelligence · Computer Science 2015-03-19 Todd Hester, Michael Quinlan, Peter Stone

Safe reinforcement learning (RL) is crucial for deploying RL agents in real-world applications, as it aims to maximize long-term rewards while satisfying safety constraints. However, safe RL often suffers from sample inefficiency, requiring…

Machine Learning · Computer Science 2024-06-03 Shangding Gu, Laixi Shi, Yuhao Ding, Alois Knoll, Costas Spanos, Adam Wierman, Ming Jin

This paper considers a class of reinforcement learning problems, which involve systems with two types of states: stochastic and pseudo-stochastic. In such systems, stochastic states follow a stochastic transition kernel while the…

Machine Learning · Computer Science 2023-11-09 Honghao Wei, Xin Liu, Weina Wang, Lei Ying

Reinforcement learning (RL) is crucial for data science decision-making but suffers from sample inefficiency, particularly in real-world scenarios with costly physical interactions. This paper introduces a novel human-inspired framework to…

Machine Learning · Computer Science 2024-03-13 Ali Beikmohammadi, Sindri Magnússon

Executing workflows on volunteer computing resources where individual tasks may be forced to relinquish their resource for the resource's primary use leads to unpredictability and often significantly increases execution time. Task…

Performance · Computer Science 2022-09-28 Andrew Stephen McGough, Matthew Forshaw