Related papers: Policy Design for Active Sequential Hypothesis Tes…
Hypothesis testing is an important problem with applications in target localization, clinical trials etc. Many active hypothesis testing strategies operate in two phases: an exploration phase and a verification phase. In the exploration…
Consider a decision maker who is responsible to dynamically collect observations so as to enhance his information about an underlying phenomena of interest in a speedy manner while accounting for the penalty of wrong declaration. Due to the…
Health-related data is noisy and stochastic in implying the true physiological states of patients, limiting information contained in single-moment observations for sequential clinical decision making. We model patient-clinician interactions…
Given a malfunctioning system, sequential diagnosis aims at identifying the root cause of the failure in terms of abnormally behaving system components. As initial system observations usually do not suffice to deterministically pin down…
Partially Observable Markov Decision Processes (POMDPs) are a powerful framework for planning under uncertainty. They allow to model state uncertainty as a belief probability distribution. Approximate solvers based on Monte Carlo sampling…
A key challenge in satisficing planning is to use multiple heuristics within one heuristic search. An aggregation of multiple heuristic estimates, for example by taking the maximum, has the disadvantage that bad estimates of a single…
Active learning aims to select a small subset of data for annotation such that a classifier learned on the data is highly accurate. This is usually done using heuristic selection methods, however the effectiveness of such methods is limited…
In many reinforcement learning (RL) applications, augmenting the task rewards with heuristic rewards that encode human priors about how a task should be solved is crucial for achieving desirable performance. However, because such heuristics…
We consider the problem of finding good finite-horizon policies for POMDPs under the expected reward metric. The policies considered are {em free finite-memory policies with limited memory}; a policy is a mapping from the space of…
An active hypothesis testing problem is formulated. In this problem, the agent can perform a fixed number of experiments and then decide on one of the hypotheses. The agent is also allowed to declare its experiments inconclusive if needed.…
Interaction-aware planning for autonomous driving requires an exploration of a combinatorial solution space when using conventional search- or optimization-based motion planners. With Deep Reinforcement Learning, optimal driving strategies…
Bayesian approaches developed to solve the optimal design of sequential experiments are mathematically elegant but computationally challenging. Recently, techniques using amortization have been proposed to make these Bayesian approaches…
POMDPs capture a broad class of decision making problems, but hardness results suggest that learning is intractable even in simple settings due to the inherent partial observability. However, in many realistic problems, more information is…
Partially Observable Markov Decision Processes (POMDPs) are powerful models for sequential decision making under transition and observation uncertainties. This paper studies the challenging yet important problem in POMDPs known as the…
The synthesis problem for partially observable Markov decision processes (POMDPs) is to compute a policy that satisfies a given specification. Such policies have to take the full execution history of a POMDP into account, rendering the…
This paper addresses the problem of optimal control of robotic sensing systems aimed at autonomous information gathering in scenarios such as environmental monitoring, search and rescue, and surveillance and reconnaissance. The information…
We consider the problem of optimally utilizing $N$ resources, each in an unknown binary state. The state of each resource can be inferred from state-dependent noisy measurements. Depending on its state, utilizing a resource results in…
Many processes, such as discrete event systems in engineering or population dynamics in biology, evolve in discrete space and continuous time. We consider the problem of optimal decision making in such discrete state and action space…
Acting to complete tasks in stochastic partially observable domains is an important problem in artificial intelligence, and is often formulated as a goal-based POMDP. Goal-based POMDPs can be solved using the RTDP-BEL algorithm, that…
While language models (LMs) offer significant capability in zero-shot reasoning tasks across a wide range of domains, they do not perform satisfactorily in problems which requires multi-step reasoning. Previous approaches to mitigate this…