Related papers: Entropy-Guided Control Improvisation
Analyzing and controlling system entropy is a powerful tool for regulating predictability of control systems. Applications benefiting from such approaches range from reinforcement learning and data security to human-robot collaboration. In…
We propose novel controller synthesis techniques for probabilistic systems modelled using stochastic two-player games: one player acts as a controller, the second represents its environment, and probability is used to capture uncertainty…
In many synthesis problems, it can be essential to generate implementations which not only satisfy functional constraints but are also randomized to improve variety, robustness, or unpredictability. The recently-proposed framework of…
Sample-based trajectory optimisers are a promising tool for the control of robotics with non-differentiable dynamics and cost functions. Contemporary approaches derive from a restricted subclass of stochastic optimal control where the…
Regularization of control policies using entropy can be instrumental in adjusting predictability of real-world systems. Applications benefiting from such approaches range from, e.g., cybersecurity, which aims at maximal unpredictability, to…
Reactive synthesis is a paradigm for automatically building correct-by-construction systems that interact with an unknown or adversarial environment. We study how to do reactive synthesis when part of the specification of the system is that…
Having a perfect model to compute the optimal policy is often infeasible in reinforcement learning. It is important in high-stakes domains to quantify and manage risk induced by model uncertainties. Entropic risk measure is an exponential…
Entropy regularization is commonly used to improve policy optimization in reinforcement learning. It is believed to help with \emph{exploration} by encouraging the selection of more stochastic policies. In this work, we analyze this claim…
Entropy regularized algorithms such as Soft Q-learning and Soft Actor-Critic, recently showed state-of-the-art performance on a number of challenging reinforcement learning (RL) tasks. The regularized formulation modifies the standard RL…
Capturing uncertainty in models of complex dynamical systems is crucial to designing safe controllers. Stochastic noise causes aleatoric uncertainty, whereas imprecise knowledge of model parameters leads to epistemic uncertainty. Several…
Entropic risk (ERisk) is an established risk measure in finance, quantifying risk by an exponential re-weighting of rewards. We study ERisk for the first time in the context of turn-based stochastic games with the total reward objective.…
Model-based planners for partially observable problems must accommodate both model uncertainty during planning and goal uncertainty during objective inference. However, model-based planners may be brittle under these types of uncertainty…
This work demonstrates how the concept of the entropic potential of events -- a parameter quantifying the influence of discrete events on the expected future entropy of a system -- can enhance uncertainty quantification, decision-making,…
The entropy regularization is inspired by information entropy from machine learning and the ideas of exploration and exploitation in reinforcement learning, which appears in the control problem to design an approximating algorithm for the…
We introduce Implicit Policy, a general class of expressive policies that can flexibly represent complex action distributions in reinforcement learning, with efficient algorithms to compute entropy regularized policy gradients. We…
The framework of reinforcement learning or optimal control provides a mathematical formalization of intelligent decision making that is powerful and broadly applicable. While the general form of the reinforcement learning problem enables…
Making decisions freely presupposes that there is some indeterminacy in the environment and in the decision making engine. The former is reflected on the behavioral changes due to communicating: few changes indicate rigid environments;…
Computers are deterministic dynamical systems (CHAOS 19:033124, 2009). Among other things, that implies that one should be able to use deterministic forecast rules to predict their behavior. That statement is sometimes-but not always-true.…
Stochastic and soft optimal policies resulting from entropy-regularized Markov decision processes (ER-MDP) are desirable for exploration and imitation learning applications. Motivated by the fact that such policies are sensitive with…
Discrete-time stochastic systems are an essential modelling tool for many engineering systems. We consider stochastic control systems that are evolving over continuous spaces. For this class of models, methods for the formal verification…