Related papers: Randomized Value Functions via Multiplicative Norm…

Randomized Policy Learning for Continuous State and Action MDPs

Deep reinforcement learning methods have achieved state-of-the-art results in a variety of challenging, high-dimensional domains ranging from video games to locomotion. The key to success has been the use of deep neural networks used to…

Machine Learning · Computer Science 2020-11-17 Hiteshi Sharma, Rahul Jain

Deep Exploration via Randomized Value Functions

We study the use of randomized value functions to guide deep exploration in reinforcement learning. This offers an elegant means for synthesizing statistically and computationally efficient exploration with common practical approaches to…

Machine Learning · Statistics 2019-09-25 Ian Osband, Benjamin Van Roy, Daniel Russo, Zheng Wen

Accelerating Policy Gradient by Estimating Value Function from Prior Computation in Deep Reinforcement Learning

This paper investigates the use of prior computation to estimate the value function to improve sample efficiency in on-policy policy gradient methods in reinforcement learning. Our approach is to estimate the value function from prior…

Machine Learning · Computer Science 2023-02-06 Md Masudur Rahman, Yexiang Xue

Efficient Exploration through Bayesian Deep Q-Networks

We study reinforcement learning (RL) in high dimensional episodic Markov decision processes (MDP). We consider value-based RL when the optimal Q-value is a linear function of d-dimensional state-action feature representation. For instance,…

Artificial Intelligence · Computer Science 2019-09-10 Kamyar Azizzadenesheli, Animashree Anandkumar

Improving Deep Policy Gradients with Value Function Search

Deep Policy Gradient (PG) algorithms employ value networks to drive the learning of parameterized policies and reduce the variance of the gradient estimates. However, value function approximation gets stuck in local optima and struggles to…

Machine Learning · Computer Science 2023-02-21 Enrico Marchesini, Christopher Amato

Variational Mixture of Normalizing Flows

In the past few years, deep generative models, such as generative adversarial networks, variational autoencoders, and their variants, have seen wide adoption for the task of modelling complex data…

Machine Learning · Statistics 2020-09-02 Guilherme G. P. Freitas Pires, Mário A. T. Figueiredo

Convolutional Normalizing Flows for Deep Gaussian Processes

Deep Gaussian processes (DGPs), a hierarchical composition of GP models, have successfully boosted the expressive power of their single-layer counterpart. However, it is impossible to perform exact inference in DGPs, which has motivated the…

Machine Learning · Computer Science 2021-05-27 Haibin Yu, Dapeng Liu, Yizhou Chen, Bryan Kian Hsiang Low, Patrick Jaillet

Rethinking Value Function Learning for Generalization in Reinforcement Learning

Our work focuses on training RL agents on multiple visually diverse environments to improve observational generalization performance. In prior methods, policy and value networks are separately optimized using a disjoint network architecture…

Machine Learning · Computer Science 2023-01-10 Seungyong Moon, JunYeong Lee, Hyun Oh Song

Deterministic Value-Policy Gradients

Reinforcement learning algorithms such as the deep deterministic policy gradient algorithm (DDPG) have been widely used in continuous control tasks. However, the model-free DDPG algorithm suffers from high sample complexity. In this paper we…

Machine Learning · Computer Science 2019-11-14 Qingpeng Cai, Ling Pan, Pingzhong Tang

Deep Normalizing Flows for State Estimation

Safe and reliable state estimation techniques are a critical component of next-generation robotic systems. Agents in such systems must be able to reason about the intentions and trajectories of other agents for safe and efficient motion…

Robotics · Computer Science 2023-06-28 Harrison Delecki, Liam A. Kruse, Marc R. Schlichting, Mykel J. Kochenderfer

Normality-Guided Distributional Reinforcement Learning for Continuous Control

Learning a predictive model of the mean return, or value function, plays a critical role in many reinforcement learning algorithms. Distributional reinforcement learning (DRL) has been shown to improve performance by modeling the value…

Machine Learning · Computer Science 2025-07-08 Ju-Seung Byun, Andrew Perrault

Active Importance Sampling for Variational Objectives Dominated by Rare Events: Consequences for Optimization and Generalization

Deep neural networks, when optimized with sufficient data, provide accurate representations of high-dimensional functions; in contrast, function approximation techniques that have predominated in scientific computing do not scale well with…

Data Analysis, Statistics and Probability · Physics 2021-03-15 Grant M. Rotskoff, Andrew R. Mitchell, Eric Vanden-Eijnden

Multiplicative Normalizing Flows for Variational Bayesian Neural Networks

We reinterpret multiplicative noise in neural networks as auxiliary random variables that augment the approximate posterior in a variational setting for Bayesian neural networks. We show that through this interpretation it is both efficient…

Machine Learning · Statistics 2017-06-14 Christos Louizos, Max Welling

Value Function Decomposition in Markov Recommendation Process

Recent advances in recommender systems have shown that user-system interaction essentially formulates long-term optimization problems, and online reinforcement learning can be adopted to improve recommendation performance. The general…

Information Retrieval · Computer Science 2025-02-04 Xiaobei Wang, Shuchang Liu, Qingpeng Cai, Xiang Li, Lantao Hu, Han Li, Guangming Xie

PaddingFlow: Improving Normalizing Flows with Padding-Dimensional Noise

Normalizing flow is a generative modeling approach with efficient sampling. However, flow-based models suffer from two issues: 1) if the target distribution is manifold, due to the mismatch between the dimensions of the latent target distribution…

Machine Learning · Computer Science 2024-04-24 Qinglong Meng, Chongkun Xia, Xueqian Wang

Flow-based Domain Randomization for Learning and Sequencing Robotic Skills

Domain randomization in reinforcement learning is an established technique for increasing the robustness of control policies trained in simulation. By randomizing environment properties during training, the learned policy can become robust…

Robotics · Computer Science 2025-05-07 Aidan Curtis, Eric Li, Michael Noseworthy, Nishad Gothoskar, Sachin Chitta, Hui Li, Leslie Pack Kaelbling, Nicole Carey

DNA: Proximal Policy Optimization with a Dual Network Architecture

This paper explores the problem of simultaneously learning a value function and policy in deep actor-critic reinforcement learning models. We find that the common practice of learning these functions jointly is sub-optimal, due to an…

Machine Learning · Computer Science 2022-11-15 Matthew Aitchison, Penny Sweetser

Value function estimation using conditional diffusion models for control

A fairly reliable trend in deep reinforcement learning is that the performance scales with the number of parameters, provided a complementary scaling in amount of training data. As the appetite for large models increases, it is imperative…

Machine Learning · Computer Science 2023-06-14 Bogdan Mazoure, Walter Talbott, Miguel Angel Bautista, Devon Hjelm, Alexander Toshev, Josh Susskind

Discrete Sequential Prediction of Continuous Actions for Deep RL

It has long been assumed that high dimensional continuous control problems cannot be solved effectively by discretizing individual dimensions of the action space due to the exponentially large number of bins over which policies would have…

Machine Learning · Computer Science 2019-06-11 Luke Metz, Julian Ibarz, Navdeep Jaitly, James Davidson

REValueD: Regularised Ensemble Value-Decomposition for Factorisable Markov Decision Processes

Discrete-action reinforcement learning algorithms often falter in tasks with high-dimensional discrete action spaces due to the vast number of possible actions. A recent advancement leverages value-decomposition, a concept from multi-agent…

Machine Learning · Computer Science 2024-03-11 David Ireland, Giovanni Montana