Related papers: Efficient Exploration for LLMs

Application of LLMs to Multi-Robot Path Planning and Task Allocation

Efficient exploration is a well-known problem in deep reinforcement learning and this problem is exacerbated in multi-agent reinforcement learning due to the intrinsic complexities of such algorithms. There are several approaches to…

Artificial Intelligence · Computer Science 2025-07-11 Ashish Kumar

Toward Efficient Exploration by Large Language Model Agents

A burgeoning area within reinforcement learning (RL) is the design of sequential decision-making agents centered around large language models (LLMs). While autonomous decision-making agents powered by modern LLMs could facilitate numerous…

Machine Learning · Computer Science 2026-02-10 Dilip Arumugam, Thomas L. Griffiths

Efficient Exploration at Scale

We develop an online learning algorithm that dramatically improves the data efficiency of reinforcement learning from human feedback (RLHF). Our algorithm incrementally updates reward and language models as choice data is received. The…

Machine Learning · Computer Science 2026-03-19 Seyed Mohammad Asghari, Chris Chute, Vikranth Dwaracherla, Xiuyuan Lu, Mehdi Jafarnia, Victor Minden, Zheng Wen, Benjamin Van Roy

Model-Based Active Exploration

Efficient exploration is an unsolved problem in Reinforcement Learning which is usually addressed by reactively rewarding the agent for fortuitously encountering novel situations. This paper introduces an efficient active exploration…

Machine Learning · Computer Science 2019-06-17 Pranav Shyam, Wojciech Jaśkowski, Faustino Gomez

Guiding Pretraining in Reinforcement Learning with Large Language Models

Reinforcement learning algorithms typically struggle in the absence of a dense, well-shaped reward function. Intrinsically motivated exploration methods address this limitation by rewarding agents for visiting novel states or transitions,…

Machine Learning · Computer Science 2023-09-18 Yuqing Du, Olivia Watkins, Zihan Wang, Cédric Colas, Trevor Darrell, Pieter Abbeel, Abhishek Gupta, Jacob Andreas

Implicit Generative Modeling for Efficient Exploration

Efficient exploration remains a challenging problem in reinforcement learning, especially for those tasks where rewards from environments are sparse. A commonly used approach for exploring such environments is to introduce some "intrinsic"…

Machine Learning · Computer Science 2020-07-16 Neale Ratzlaff, Qinxun Bai, Li Fuxin, Wei Xu

Query-Efficient Planning with Language Models

Planning in complex environments requires an agent to efficiently query a world model to find a feasible sequence of actions from start to goal. Recent work has shown that Large Language Models (LLMs), with their rich prior knowledge and…

Artificial Intelligence · Computer Science 2024-12-10 Gonzalo Gonzalez-Pumariega, Wayne Chen, Kushal Kedia, Sanjiban Choudhury

Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration

With expansive state-action spaces, efficient multi-agent exploration remains a longstanding challenge in reinforcement learning. Although pursuing novelty, diversity, or uncertainty attracts increasing attention, redundant efforts brought…

Artificial Intelligence · Computer Science 2024-10-04 Yun Qu, Boyuan Wang, Yuhang Jiang, Jianzhun Shao, Yixiu Mao, Cheems Wang, Chang Liu, Xiangyang Ji

Disentangling Exploration of Large Language Models by Optimal Exploitation

Exploration is a crucial skill for in-context reinforcement learning in unknown environments. However, it remains unclear if large language models can effectively explore a partially hidden state space. This work isolates exploration as the…

Machine Learning · Computer Science 2025-08-26 Tim Grams, Patrick Betz, Sascha Marton, Stefan Lüdtke, Christian Bartelt

Exploration enhances cooperation in the multi-agent communication system

Designing protocols enhancing cooperation for multi-agent systems remains a grand challenge. Cheap talk, defined as costless, non-binding communication before formal action, serves as a pivotal solution. However, existing theoretical…

Multiagent Systems · Computer Science 2026-03-03 Zhao Song, Chen Shen, Zhen Wang, The Anh Han

Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration

Language model alignment (or, reinforcement learning) techniques that leverage active exploration -- deliberately encouraging the model to produce diverse, informative responses -- offer the promise of super-human capabilities. However,…

Machine Learning · Computer Science 2025-03-17 Dylan J. Foster, Zakaria Mhammedi, Dhruv Rohatgi

Adaptive Reasoning Executor: A Collaborative Agent System for Efficient Reasoning

Recent advances in Large Language Models (LLMs) demonstrate that chain-of-thought prompting and deep reasoning substantially enhance performance on complex tasks, and multi-agent systems can further improve accuracy by enabling model…

Artificial Intelligence · Computer Science 2025-10-16 Zehui Ling, Deshu Chen, Yichi Zhang, Yuchen Liu, Xigui Li, Xin Guo, Yuan Cheng

Strategically Efficient Exploration in Competitive Multi-agent Reinforcement Learning

High sample complexity remains a barrier to the application of reinforcement learning (RL), particularly in multi-agent systems. A large body of work has demonstrated that exploration mechanisms based on the principle of optimism under…

Machine Learning · Computer Science 2021-08-02 Robert Loftin, Aadirupa Saha, Sam Devlin, Katja Hofmann

Learning to Explore: Scaling Agentic Reasoning via Exploration-Aware Policy Optimization

Recent advancements in agentic test-time scaling allow models to gather environmental feedback before committing to final actions. A key limitation of existing methods is that they typically employ undifferentiated exploration strategies,…

Artificial Intelligence · Computer Science 2026-05-13 Xingyuan Hua, Sheng Yue, Ju Ren

Rethinking the Role of Entropy in Optimizing Tool-Use Behaviors for Large Language Model Agents

Tool-using agents based on Large Language Models (LLMs) excel in tasks such as mathematical reasoning and multi-hop question answering. However, in long trajectories, agents often trigger excessive and low-quality tool calls, increasing…

Artificial Intelligence · Computer Science 2026-03-25 Zeping Li, Hongru Wang, Yiwen Zhao, Guanhua Chen, Yixia Li, Keyang Chen, Yixin Cao, Guangnan Ye, Hongfeng Chai, Zhenfei Yin

Sample Efficient Preference Alignment in LLMs via Active Exploration

Preference-based feedback is important for many applications in machine learning where evaluation of a reward function is not feasible. Notable recent examples arise in preference alignment for large language models, including in…

Machine Learning · Computer Science 2025-03-21 Viraj Mehta, Syrine Belakaria, Vikramjeet Das, Ojash Neopane, Yijia Dai, Ilija Bogunovic, Barbara Engelhardt, Stefano Ermon, Jeff Schneider, Willie Neiswanger

QExplorer: Large Language Model Based Query Extraction for Toxic Content Exploration

Automatically extracting effective queries is challenging in information retrieval, especially in toxic content exploration, as such content is likely to be disguised. With the recent achievements in generative Large Language Model (LLM),…

Information Retrieval · Computer Science 2025-02-27 Shaola Ren, Li Ke, Longtao Huang, Dehong Gao, Hui Xue

Efficient Personalization of Generative Models via Optimal Experimental Design

Preference learning from human feedback has the ability to align generative models with the needs of end-users. Human feedback is costly and time-consuming to obtain, which creates demand for data-efficient query selection methods. This…

Machine Learning · Computer Science 2026-02-18 Guy Schacht, Ziyad Sheebaelhamd, Riccardo De Santi, Mojmír Mutný, Andreas Krause

User Feedback Alignment for LLM-powered Exploration in Large-scale Recommendation Systems

Exploration, the act of broadening user experiences beyond their established preferences, is challenging in large-scale recommendation systems due to feedback loops and limited signals on user exploration patterns. Large Language Models…

Information Retrieval · Computer Science 2025-05-29 Jianling Wang, Yifan Liu, Yinghao Sun, Xuejian Ma, Yueqi Wang, He Ma, Zhengyang Su, Minmin Chen, Mingyan Gao, Onkar Dalal, Ed H. Chi, Lichan Hong, Ningren Han, Haokai Lu

Should You Use Your Large Language Model to Explore or Exploit?

We evaluate the ability of the current generation of large language models (LLMs) to help a decision-making agent facing an exploration-exploitation tradeoff. While previous work has largely studied the ability of LLMs to solve combined…

Machine Learning · Computer Science 2026-02-18 Keegan Harris, Aleksandrs Slivkins