Related papers: Policy Design for Active Sequential Hypothesis Tes…

Sequential Experiment Design for Hypothesis Verification

Hypothesis testing is an important problem with applications in target localization, clinical trials, etc. Many active hypothesis testing strategies operate in two phases: an exploration phase and a verification phase. In the exploration…

Machine Learning · Statistics 2018-12-05 Dhruva Kartik, Ashutosh Nayyar, Urbashi Mitra

Active sequential hypothesis testing

Consider a decision maker who is responsible to dynamically collect observations so as to enhance his information about an underlying phenomenon of interest in a speedy manner while accounting for the penalty of wrong declaration. Due to the…

Information Theory · Computer Science 2013-12-19 Mohammad Naghshvar, Tara Javidi

Optimizing Sequential Medical Treatments with Auto-Encoding Heuristic Search in POMDPs

Health-related data is noisy and stochastic in implying the true physiological states of patients, limiting information contained in single-moment observations for sequential clinical decision making. We model patient-clinician interactions…

Artificial Intelligence · Computer Science 2019-05-21 Luchen Li, Matthieu Komorowski, Aldo A. Faisal

Evaluating Active Learning Heuristics for Sequential Diagnosis

Given a malfunctioning system, sequential diagnosis aims at identifying the root cause of the failure in terms of abnormally behaving system components. As initial system observations usually do not suffice to deterministically pin down…

Artificial Intelligence · Computer Science 2022-08-08 Patrick Rodler, Wolfgang Schmid

Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach

Partially Observable Markov Decision Processes (POMDPs) are a powerful framework for planning under uncertainty. They allow state uncertainty to be modeled as a belief probability distribution. Approximate solvers based on Monte Carlo sampling…

Artificial Intelligence · Computer Science 2024-03-01 Daniele Meli, Alberto Castellini, Alessandro Farinelli

Learning Heuristic Selection with Dynamic Algorithm Configuration

A key challenge in satisficing planning is to use multiple heuristics within one heuristic search. An aggregation of multiple heuristic estimates, for example by taking the maximum, has the disadvantage that bad estimates of a single…

Artificial Intelligence · Computer Science 2021-04-13 David Speck, André Biedenkapp, Frank Hutter, Robert Mattmüller, Marius Lindauer

Learning how to Active Learn: A Deep Reinforcement Learning Approach

Active learning aims to select a small subset of data for annotation such that a classifier learned on the data is highly accurate. This is usually done using heuristic selection methods, however the effectiveness of such methods is limited…

Computation and Language · Computer Science 2017-08-09 Meng Fang, Yuan Li, Trevor Cohn

Going Beyond Heuristics by Imposing Policy Improvement as a Constraint

In many reinforcement learning (RL) applications, augmenting the task rewards with heuristic rewards that encode human priors about how a task should be solved is crucial for achieving desirable performance. However, because such heuristics…

Machine Learning · Computer Science 2025-07-09 Chi-Chang Lee, Zhang-Wei Hong, Pulkit Agrawal

My Brain is Full: When More Memory Helps

We consider the problem of finding good finite-horizon policies for POMDPs under the expected reward metric. The policies considered are free finite-memory policies with limited memory; a policy is a mapping from the space of…

Artificial Intelligence · Computer Science 2013-01-30 Christopher Lusena, Tong Li, Shelia Sittinger, Chris Wells, Judy Goldsmith

Active Hypothesis Testing: Beyond Chernoff-Stein

An active hypothesis testing problem is formulated. In this problem, the agent can perform a fixed number of experiments and then decide on one of the hypotheses. The agent is also allowed to declare its experiments inconclusive if needed.…

Information Theory · Computer Science 2019-01-23 Dhruva Kartik, Ashutosh Nayyar, Urbashi Mitra

Experience-Based Heuristic Search: Robust Motion Planning with Deep Q-Learning

Interaction-aware planning for autonomous driving requires an exploration of a combinatorial solution space when using conventional search- or optimization-based motion planners. With Deep Reinforcement Learning, optimal driving strategies…

Robotics · Computer Science 2021-02-08 Julian Bernhard, Robert Gieselmann, Klemens Esterle, Alois Knoll

Optimizing Sequential Experimental Design with Deep Reinforcement Learning

Bayesian approaches developed to solve the optimal design of sequential experiments are mathematically elegant but computationally challenging. Recently, techniques using amortization have been proposed to make these Bayesian approaches…

Machine Learning · Computer Science 2022-06-20 Tom Blau, Edwin V. Bonilla, Iadine Chades, Amir Dezfouli

Learning in POMDPs is Sample-Efficient with Hindsight Observability

POMDPs capture a broad class of decision making problems, but hardness results suggest that learning is intractable even in simple settings due to the inherent partial observability. However, in many realistic problems, more information is…

Machine Learning · Computer Science 2023-02-07 Jonathan N. Lee, Alekh Agarwal, Christoph Dann, Tong Zhang

Sound Heuristic Search Value Iteration for Undiscounted POMDPs with Reachability Objectives

Partially Observable Markov Decision Processes (POMDPs) are powerful models for sequential decision making under transition and observation uncertainties. This paper studies the challenging yet important problem in POMDPs known as the…

Artificial Intelligence · Computer Science 2024-06-06 Qi Heng Ho, Martin S. Feather, Federico Rossi, Zachary N. Sunberg, Morteza Lahijanian

Strengthening Deterministic Policies for POMDPs

The synthesis problem for partially observable Markov decision processes (POMDPs) is to compute a policy that satisfies a given specification. Such policies have to take the full execution history of a POMDP into account, rendering the…

Artificial Intelligence · Computer Science 2020-07-20 Leonore Winterer, Ralf Wimmer, Nils Jansen, Bernd Becker

Myopic Policy Bounds for Information Acquisition POMDPs

This paper addresses the problem of optimal control of robotic sensing systems aimed at autonomous information gathering in scenarios such as environmental monitoring, search and rescue, and surveillance and reconnaissance. The information…

Systems and Control · Computer Science 2016-01-28 Mikko Lauri, Nikolay Atanasov, George J. Pappas, Risto Ritala

Utility Maximizing Sequential Sensing Over a Finite Horizon

We consider the problem of optimally utilizing $N$ resources, each in an unknown binary state. The state of each resource can be inferred from state-dependent noisy measurements. Depending on its state, utilizing a resource results in…

Systems and Control · Computer Science 2017-05-18 Lorenzo Ferrari, Qing Zhao, Anna Scaglione

POMDPs in Continuous Time and Discrete Spaces

Many processes, such as discrete event systems in engineering or population dynamics in biology, evolve in discrete space and continuous time. We consider the problem of optimal decision making in such discrete state and action space…

Machine Learning · Computer Science 2020-10-27 Bastian Alt, Matthias Schultheis, Heinz Koeppl

Heuristics for Partially Observable Stochastic Contingent Planning

Acting to complete tasks in stochastic partially observable domains is an important problem in artificial intelligence, and is often formulated as a goal-based POMDP. Goal-based POMDPs can be solved using the RTDP-BEL algorithm, that…

Artificial Intelligence · Computer Science 2024-10-10 Guy Shani

Plan of Thoughts: Heuristic-Guided Problem Solving with Large Language Models

While language models (LMs) offer significant capability in zero-shot reasoning tasks across a wide range of domains, they do not perform satisfactorily in problems that require multi-step reasoning. Previous approaches to mitigate this…

Computation and Language · Computer Science 2024-05-01 Houjun Liu