Related papers: Learning Markov Decision Processes for Model Check…
Many automated system analysis techniques (e.g., model checking, model-based testing) rely on first obtaining a model of the system under analysis. System modeling is often done manually, which is often considered as a hindrance to adopt…
Automata learning techniques automatically generate system models from test observations. These techniques usually fall into two categories: passive and active. Passive learning uses a predetermined data set, e.g., system logs. In contrast,…
This paper proposes to use probabilistic model checking to synthesize optimal robot policies in multi-tasking autonomous systems that are subject to human-robot interaction. Given the convincing empirical evidence that human behavior can be…
Cyber-physical systems (CPSs) are naturally modelled as reactive systems with nondeterministic and probabilistic dynamics. Model-based verification techniques have proved effective in the deployment of safety-critical CPSs. Central for a…
Design and control of autonomous systems that operate in uncertain or adversarial environments can be facilitated by formal modelling and analysis. Probabilistic model checking is a technique to automatically verify, for a given temporal…
This paper presents an online method that learns optimal decisions for a discrete time Markov decision problem with an opportunistic structure. The state at time $t$ is a pair $(S(t),W(t))$ where $S(t)$ takes values in a finite set…
The design of decision and control strategies for switched systems typically requires complete knowledge of (i) mathematical models of the subsystems and (ii) restrictions on admissible switches between the subsystems. We propose an active…
The goal of this paper is to analyze distributional Markov Decision Processes as a class of control problems in which the objective is to learn policies that steer the distribution of a cumulative reward toward a prescribed target law,…
Checklists have been widely recognized as effective tools for completing complex tasks in a systematic manner. Although originally intended for use in procedural tasks, their interpretability and ease of use have led to their adoption for…
Model learning has gained increasing interest in recent years. It derives behavioural models from test data of black-box systems. The main advantage offered by such techniques is that they enable model-based analysis without access to the…
This article presents a short and concise description of stochastic approximation algorithms in reinforcement learning of Markov decision processes. The algorithms can also be used as a suboptimal method for partially observed Markov…
Herein we suggest a mobile robot-training algorithm that is based on the preference approximation of the decision taker who controls the robot, which in its turn is managed by the Markov chain. Setup of the model parameters is made on the…
We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions-discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use…
General purpose intelligent learning agents cycle through (complex,non-MDP) sequences of observations, actions, and rewards. On the other hand, reinforcement learning is well-developed for small finite state Markov Decision Processes…
It is well known that options can make planning more efficient, among their many benefits. Thus far, algorithms for autonomously discovering a set of useful options were heuristic. Naturally, a principled way of finding a set of useful…
We present a system for interactive examination of learned security policies. It allows a user to traverse episodes of Markov decision processes in a controlled manner and to track the actions triggered by security policies. Similar to a…
We study the problem of learning Markov decision processes with finite state and action spaces when the transition probability distributions and loss functions are chosen adversarially and are allowed to change with time. We introduce an…
There are two common approaches for optimizing the performance of a machine: genetic algorithms and machine learning. A genetic algorithm is applied over many generations whereas machine learning works by applying feedback until the system…
There has been substantial progress in the inference of formal behavioural specifications from sample trajectories, for example, using Linear Temporal Logic (LTL). However, these techniques cannot handle specifications that correctly…
In this paper, we are interested in optimal decisions in a partially observable Markov universe. Our viewpoint departs from the dynamic programming viewpoint: we are directly approximating an optimal strategic tree depending on the…