Related papers: Learning from Demonstrations using Signal Temporal…
Apprenticeship learning crucially depends on effectively learning rewards, and hence control policies from user demonstrations. Of particular difficulty is the setting where the desired task consists of a number of sub-goals with temporal…
Techniques based on Reinforcement Learning (RL) are increasingly being used to design control policies for robotic systems. RL fundamentally relies on state-based reward functions to encode desired behavior of the robot and bad reward…
Tasks with complex temporal structures and long horizons pose a challenge for reinforcement learning agents due to the difficulty in specifying the tasks in terms of reward functions as well as large variances in the learning signals. We…
Reinforcement learning (RL) is a promising approach. However, success is limited to real-world applications, because ensuring safe exploration and facilitating adequate exploitation is a challenge for controlling robotic systems with…
In many settings (e.g., robotics) demonstrations provide a natural way to specify tasks; however, most methods for learning from demonstrations either do not provide guarantees that the artifacts learned for the tasks, such as rewards or…
Most current methods for learning from demonstrations assume that those demonstrations alone are sufficient to learn the underlying task. This is often untrue, especially if extra safety specifications exist which were not present in the…
Reinforcement learning (RL) depends critically on the choice of reward functions used to capture the de- sired behavior and constraints of a robot. Usually, these are handcrafted by a expert designer and represent heuristics for relatively…
Temporal logic rules are often used in control and robotics to provide structured, human-interpretable descriptions of trajectory data. These rules have numerous applications including safety validation using formal methods, constraining…
Using reinforcement learning to learn control policies is a challenge when the task is complex with potentially long horizons. Ensuring adequate but safe exploration is also crucial for controlling physical systems. In this paper, we use…
We consider the problem of steering a system with unknown, stochastic dynamics to satisfy a rich, temporally layered task given as a signal temporal logic formula. We represent the system as a Markov decision process in which the states are…
Learning from demonstration (LfD) has succeeded in tasks featuring a long time horizon. However, when the problem complexity also includes human-in-the-loop perturbations, state-of-the-art approaches do not guarantee the successful…
Learning robot control policies from demonstrations is a powerful paradigm, yet real-world data is often suboptimal, noisy, or otherwise imperfect, posing significant challenges for imitation and reinforcement learning. In this work, we…
Endowed with higher levels of autonomy, robots are required to perform increasingly complex manipulation tasks. Learning from demonstration is arising as a promising paradigm for transferring skills to robots. It allows to implicitly learn…
We propose a Reinforcement Learning (RL) based control design framework for handling complex tasks. The approach extends the concept of Reward Machines (RM) with Signal Temporal Logic (STL) formulas that can be used for event generation.…
In the learning from demonstration (LfD) paradigm, understanding and evaluating the demonstrated behaviors plays a critical role in extracting control policies for robots. Without this knowledge, a robot may infer incorrect reward functions…
Although reinforcement learning has seen tremendous success recently, this kind of trial-and-error learning can be impractical or inefficient in complex environments. The use of demonstrations, on the other hand, enables agents to benefit…
Providing a suitable reward function to reinforcement learning can be difficult in many real world applications. While inverse reinforcement learning (IRL) holds promise for automatically learning reward functions from demonstrations,…
For performing robotic manipulation tasks, the core problem is determining suitable trajectories that fulfill the task requirements. Various approaches to compute such trajectories exist, being learning and optimization the main driving…
In this paper, we address the discovery of robotic options from demonstrations in an unsupervised manner. Specifically, we present a framework to jointly learn low-level control policies and higher-level policies of how to use them from…
The successes of reinforcement learning in recent years are underpinned by the characterization of suitable reward functions. However, in settings where such rewards are non-intuitive, difficult to define, or otherwise error-prone in their…