Related papers: Exploratory Diffusion Model for Unsupervised Reinf…

Towards Controllable Diffusion Models via Reward-Guided Exploration

By formulating data samples' formation as a Markov denoising process, diffusion models achieve state-of-the-art performances in a collection of tasks. Recently, many variants of diffusion models have been proposed to enable controlled…

Machine Learning · Computer Science 2023-04-17 Hengtong Zhang, Tingyang Xu

Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review

This tutorial provides a comprehensive survey of methods for fine-tuning diffusion models to optimize downstream reward functions. While diffusion models are widely known to provide excellent generative modeling capability, practical…

Machine Learning · Computer Science 2024-07-19 Masatoshi Uehara, Yulai Zhao, Tommaso Biancalani, Sergey Levine

Exploration by Random Distribution Distillation

Exploration remains a critical challenge in online reinforcement learning, as an agent must effectively explore unknown environments to achieve high returns. Currently, the main exploration algorithms are primarily count-based methods and…

Machine Learning · Computer Science 2025-05-19 Zhirui Fang, Kai Yang, Jian Tao, Jiafei Lyu, Lusong Li, Li Shen, Xiu Li

EUCLID: Towards Efficient Unsupervised Reinforcement Learning with Multi-choice Dynamics Model

Unsupervised reinforcement learning (URL) poses a promising paradigm to learn useful behaviors in a task-agnostic environment without the guidance of extrinsic rewards to facilitate the fast adaptation of various downstream tasks. Previous…

Machine Learning · Computer Science 2023-02-23 Yifu Yuan, Jianye Hao, Fei Ni, Yao Mu, Yan Zheng, Yujing Hu, Jinyi Liu, Yingfeng Chen, Changjie Fan

Explore and Control with Adversarial Surprise

Unsupervised reinforcement learning (RL) studies how to leverage environment statistics to learn useful behaviors without the cost of reward engineering. However, a central challenge in unsupervised RL is to extract behaviors that…

Machine Learning · Computer Science 2021-12-30 Arnaud Fickinger, Natasha Jaques, Samyak Parajuli, Michael Chang, Nicholas Rhinehart, Glen Berseth, Stuart Russell, Sergey Levine

Unsupervised Skill Discovery through Skill Regions Differentiation

Unsupervised Reinforcement Learning (RL) aims to discover diverse behaviors that can accelerate the learning of downstream tasks. Previous methods typically focus on entropy-based exploration or empowerment-driven skill learning. However,…

Machine Learning · Computer Science 2025-06-18 Ting Xiao, Jiakun Zheng, Rushuai Yang, Kang Xu, Qiaosheng Zhang, Peng Liu, Chenjia Bai

Feedback Efficient Online Fine-Tuning of Diffusion Models

Diffusion models excel at modeling complex data distributions, including those of images, proteins, and small molecules. However, in many cases, our goal is to model parts of the distribution that maximize certain properties: for example,…

Machine Learning · Computer Science 2024-07-19 Masatoshi Uehara, Yulai Zhao, Kevin Black, Ehsan Hajiramezanali, Gabriele Scalia, Nathaniel Lee Diamant, Alex M Tseng, Sergey Levine, Tommaso Biancalani

Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning

Efficient exploration remains a challenging problem in reinforcement learning, especially for tasks where extrinsic rewards from environments are sparse or even totally disregarded. Significant advances based on intrinsic motivation show…

Machine Learning · Computer Science 2024-04-03 Chenjia Bai, Peng Liu, Kaiyu Liu, Lingxiao Wang, Yingnan Zhao, Lei Han

Reward Shaping via Diffusion Process in Reinforcement Learning

Reinforcement Learning (RL) models have continually evolved to navigate the exploration-exploitation trade-off in uncertain Markov Decision Processes (MDPs). In this study, I leverage the principles of stochastic thermodynamics and system…

Machine Learning · Computer Science 2023-06-22 Peeyush Kumar

MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

Exploration in reinforcement learning is a challenging problem: in the worst case, the agent must search for high-reward states that could be hidden anywhere in the state space. Can we define a more tractable class of RL problems, where the…

Machine Learning · Computer Science 2021-07-20 Kevin Li, Abhishek Gupta, Ashwin Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine

Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration

Recent advancements in deep reinforcement learning (RL) have demonstrated notable progress in sample efficiency, spanning both model-based and model-free paradigms. Despite the identification and mitigation of specific bottlenecks in prior…

Machine Learning · Computer Science 2024-04-02 Yibo Wang, Jiang Zhao

DIME: Diffusion-Based Maximum Entropy Reinforcement Learning

Maximum entropy reinforcement learning (MaxEnt-RL) has become the standard approach to RL due to its beneficial exploration properties. Traditionally, policies are parameterized using Gaussian distributions, which significantly limits their…

Machine Learning · Computer Science 2025-06-11 Onur Celik, Zechu Li, Denis Blessing, Ge Li, Daniel Palenicek, Jan Peters, Georgia Chalvatzaki, Gerhard Neumann

Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation

Learning diverse policies for non-prehensile manipulation is essential for improving skill transfer and generalization to out-of-distribution scenarios. In this work, we enhance exploration through a two-fold approach within a hybrid…

Robotics · Computer Science 2025-04-29 Huy Le, Tai Hoang, Miroslav Gabriel, Gerhard Neumann, Ngo Anh Vien

Implicit Generative Modeling for Efficient Exploration

Efficient exploration remains a challenging problem in reinforcement learning, especially for those tasks where rewards from environments are sparse. A commonly used approach for exploring such environments is to introduce some "intrinsic"…

Machine Learning · Computer Science 2020-07-16 Neale Ratzlaff, Qinxun Bai, Li Fuxin, Wei Xu

Never Give Up: Learning Directed Exploration Strategies

We propose a reinforcement learning agent to solve hard exploration games by learning a range of directed exploratory policies. We construct an episodic memory-based intrinsic reward using k-nearest neighbors over the agent's recent…

Machine Learning · Computer Science 2020-02-17 Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Daniel Guo, Bilal Piot, Steven Kapturowski, Olivier Tieleman, Martín Arjovsky, Alexander Pritzel, Andrew Bolt, Charles Blundell

Reward-Free Exploration for Reinforcement Learning

Exploration is widely regarded as one of the most challenging aspects of reinforcement learning (RL), with many naive approaches succumbing to exponential sample complexity. To isolate the challenges of exploration, we propose a new…

Machine Learning · Computer Science 2020-02-10 Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

Adaptive Reward-Free Exploration

Reward-free exploration is a reinforcement learning setting studied by Jin et al. (2020), who address it by running several algorithms with regret guarantees in parallel. In our work, we instead give a more natural adaptive approach for…

Machine Learning · Computer Science 2020-10-08 Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Edouard Leurent, Michal Valko

Unsupervised Learning of Efficient Exploration: Pre-training Adaptive Policies via Self-Imposed Goals

Unsupervised pre-training can equip reinforcement learning agents with prior knowledge and accelerate learning in downstream tasks. A promising direction, grounded in human development, investigates agents that learn by setting and pursuing…

Machine Learning · Computer Science 2026-01-28 Octavio Pappalardo

Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning

Exploration in complex domains is a key challenge in reinforcement learning, especially for tasks with very sparse rewards. Recent successes in deep reinforcement learning have been achieved mostly using simple heuristic exploration…

Machine Learning · Computer Science 2017-03-07 Joshua Achiam, Shankar Sastry

Mixture of Autoencoder Experts Guidance using Unlabeled and Incomplete Data for Exploration in Reinforcement Learning

Recent trends in Reinforcement Learning (RL) highlight the need for agents to learn from reward-free interactions and alternative supervision signals, such as unlabeled or incomplete demonstrations, rather than relying solely on explicit…

Machine Learning · Computer Science 2025-07-22 Elias Malomgré, Pieter Simoens