Related papers: Sample-Efficient Model-based Actor-Critic for an I…

Sample-Efficient Reinforcement Learning via Conservative Model-Based Actor-Critic

Model-based reinforcement learning algorithms, which aim to learn a model of the environment to make decisions, are more sample efficient than their model-free counterparts. The sample efficiency of model-based approaches relies on whether…

Machine Learning · Computer Science 2021-12-21 Zhihai Wang, Jie Wang, Qi Zhou, Bin Li, Houqiang Li

A Model-based Approach for Sample-efficient Multi-task Reinforcement Learning

The aim of multi-task reinforcement learning is two-fold: (1) efficiently learn by training against multiple tasks and (2) quickly adapt, using limited samples, to a variety of new tasks. In this work, the tasks correspond to reward…

Machine Learning · Computer Science 2019-11-05 Nicholas C. Landolfi, Garrett Thomas, Tengyu Ma

Sample Efficient Deep Reinforcement Learning for Dialogue Systems with Large Action Spaces

In spoken dialogue systems, we aim to deploy artificial intelligence to build automated dialogue agents that can converse with humans. A part of this effort is the policy optimisation task, which attempts to find a policy describing how to…

Computation and Language · Computer Science 2018-02-13 Gellért Weisz, Paweł Budzianowski, Pei-Hao Su, Milica Gašić

Efficient Dialog Policy Learning via Positive Memory Retention

This paper is concerned with the training of recurrent neural networks as goal-oriented dialog agents using reinforcement learning. Training such agents with policy gradients typically requires a large amount of samples. However, the…

Artificial Intelligence · Computer Science 2020-05-26 Rui Zhao, Volker Tresp

Model-based Policy Optimization using Symbolic World Model

The application of learning-based control methods in robotics presents significant challenges. One is that model-free reinforcement learning algorithms use observation data with low sample efficiency. To address this challenge, a prevalent…

Machine Learning · Computer Science 2024-07-19 Andrey Gorodetskiy, Konstantin Mironov, Aleksandr Panov

Learning Powerful Policies by Using Consistent Dynamics Model

Model-based Reinforcement Learning approaches have the promise of being sample efficient. Much of the progress in learning dynamics models in RL has been made by learning models via supervised learning. But traditional model-based…

Machine Learning · Computer Science 2019-06-12 Shagun Sodhani, Anirudh Goyal, Tristan Deleu, Yoshua Bengio, Sergey Levine, Jian Tang

Sample-efficient Actor-Critic Reinforcement Learning with Supervised Data for Dialogue Management

Deep reinforcement learning (RL) methods have significant potential for dialogue policy optimisation. However, they suffer from a poor performance in the early stages of learning. This is especially problematic for on-line learning with…

Computation and Language · Computer Science 2017-07-06 Pei-Hao Su, Pawel Budzianowski, Stefan Ultes, Milica Gasic, Steve Young

On the model-based stochastic value gradient for continuous reinforcement learning

For over a decade, model-based reinforcement learning has been seen as a way to leverage control-based domain knowledge to improve the sample-efficiency of reinforcement learning agents. While model-based agents are conceptually appealing,…

Machine Learning · Computer Science 2021-05-28 Brandon Amos, Samuel Stanton, Denis Yarats, Andrew Gordon Wilson

Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition

Many studies have applied reinforcement learning to train a dialog policy and show great promise these years. One common approach is to employ a user simulator to obtain a large number of simulated user experiences for reinforcement…

Computation and Language · Computer Science 2020-04-24 Ryuichi Takanobu, Runze Liang, Minlie Huang

Reinforcement learning with world model

Nowadays, model-free reinforcement learning algorithms have achieved remarkable performance on many decision making and control tasks, but high sample complexity and low sample efficiency still hinder the wide use of model-free…

Artificial Intelligence · Computer Science 2020-10-27 Jingbin Liu, Xinyang Gu, Shuai Liu

Efficient Model-Based Reinforcement Learning for Robot Control via Online Optimization

We present an online model-based reinforcement learning algorithm suitable for controlling complex robotic systems directly in the real world. Unlike prevailing sim-to-real pipelines that rely on extensive offline simulation and model-free…

Robotics · Computer Science 2026-05-07 Fang Nan, Hao Ma, Qinghua Guan, Josie Hughes, Michael Muehlebach, Marco Hutter

Sample-efficient Deep Reinforcement Learning with Imaginary Rollouts for Human-Robot Interaction

Deep reinforcement learning has proven to be a great success in allowing agents to learn complex tasks. However, its application to actual robots can be prohibitively expensive. Furthermore, the unpredictability of human behavior in…

Robotics · Computer Science 2019-08-16 Mohammad Thabet, Massimiliano Patacchiola, Angelo Cangelosi

Model-Based Data-Efficient and Robust Reinforcement Learning

A data-efficient learning-based control design method is proposed in this paper. It is based on learning a system dynamics model that is then leveraged in a two-level procedure. On the higher level, a simple but powerful optimization…

Systems and Control · Electrical Eng. & Systems 2026-02-03 Ludvig Svedlund, Constantin Cronrath, Jonas Fredriksson, Bengt Lennartson

Improving Proactive Dialog Agents Using Socially-Aware Reinforcement Learning

The next step for intelligent dialog agents is to escape their role as silent bystanders and become proactive. Well-defined proactive behavior may improve human-machine cooperation, as the agent takes a more active role during interaction…

Computation and Language · Computer Science 2023-06-23 Matthias Kraus, Nicolas Wagner, Ron Riekenbrauck, Wolfgang Minker

Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning

Communicating in natural language is a powerful tool in multi-agent settings, as it enables independent agents to share information in partially observable settings and allows zero-shot coordination with humans. However, most prior works…

Artificial Intelligence · Computer Science 2025-02-11 Bidipta Sarkar, Warren Xia, C. Karen Liu, Dorsa Sadigh

Relational Object-Centric Actor-Critic

The advances in unsupervised object-centric representation learning have significantly improved its application to downstream tasks. Recent works highlight that disentangled object representations can aid policy learning in image-based,…

Artificial Intelligence · Computer Science 2025-03-21 Leonid Ugadiarov, Vitaliy Vorobyov, Aleksandr I. Panov

Adaptive Human-Computer Interaction Strategies Through Reinforcement Learning in Complex

This study addresses the challenges of dynamics and complexity in intelligent human-computer interaction and proposes a reinforcement learning-based optimization framework to improve long-term returns and overall experience. Human-computer…

Human-Computer Interaction · Computer Science 2025-11-03 Rui Liu, Yifan Zhuang, Runsheng Zhang

Model-free Reinforcement Learning for Model-based Control: Towards Safe, Interpretable and Sample-efficient Agents

Training sophisticated agents for optimal decision-making under uncertainty has been key to the rapid development of modern autonomous systems across fields. Notably, model-free reinforcement learning (RL) has enabled decision-making agents…

Machine Learning · Computer Science 2025-07-21 Thomas Banker, Ali Mesbah

Improving Interaction Quality Estimation with BiLSTMs and the Impact on Dialogue Policy Learning

Learning suitable and well-performing dialogue behaviour in statistical spoken dialogue systems has been in the focus of research for many years. While most work which is based on reinforcement learning employs an objective measure like…

Computation and Language · Computer Science 2020-01-22 Stefan Ultes

Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems

Dialogue policy learning for task-oriented dialogue systems has enjoyed great progress recently mostly through employing reinforcement learning methods. However, these approaches have become very sophisticated. It is time to re-evaluate it.…

Computation and Language · Computer Science 2020-09-22 Ziming Li, Julia Kiseleva, Maarten de Rijke