Related papers: Bayesian Decision Making around Experts

Towards Bayesian Data Selection

A wide range of machine learning algorithms iteratively add data to the training sample. Examples include semi-supervised learning, active learning, multi-armed bandits, and Bayesian optimization. We embed this kind of data addition into…

Machine Learning · Statistics 2024-06-25 Julian Rodemann

Defensive Universal Learning with Experts

This paper shows how universal learning can be achieved with expert advice. To this aim, we specify an experts algorithm with the following characteristics: (a) it uses only feedback from the actions actually chosen (bandit setup), (b) it…

Machine Learning · Computer Science 2007-05-23 Jan Poland, Marcus Hutter

Bayesian Incentive-Compatible Bandit Exploration

Individual decision-makers consume information revealed by the previous decision makers, and produce information that may help in future decisions. This phenomenon is common in a wide range of scenarios in the Internet economy, as well as…

Computer Science and Game Theory · Computer Science 2019-05-06 Yishay Mansour, Aleksandrs Slivkins, Vasilis Syrgkanis

Deep Interactive Bayesian Reinforcement Learning via Meta-Learning

Agents that interact with other agents often do not know a priori what the other agents' strategies are, but have to maximise their own online return while interacting with and learning about others. The optimal adaptive behaviour under…

Machine Learning · Computer Science 2022-04-19 Luisa Zintgraf, Sam Devlin, Kamil Ciosek, Shimon Whiteson, Katja Hofmann

Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits

We study the evolution of information in interactive decision making through the lens of a stochastic multi-armed bandit problem. Focusing on a fundamental example where a unique optimal arm outperforms the rest by a fixed margin, we…

Machine Learning · Statistics 2025-10-23 Yuzhou Gu, Yanjun Han, Jian Qian

Preference-based Online Learning with Dueling Bandits: A Survey

In machine learning, the notion of multi-armed bandits refers to a class of online learning problems, in which an agent is supposed to simultaneously explore and exploit a given set of choice alternatives in the course of a sequential…

Machine Learning · Computer Science 2021-07-13 Viktor Bengs, Robert Busa-Fekete, Adil El Mesaoudi-Paul, Eyke Hüllermeier

Social Teaching: Being Informative vs. Being Right in Sequential Decision Making

We show that it can be suboptimal for Bayesian decision-making agents employing social learning to use correct prior probabilities as their initial beliefs. We consider sequential Bayesian binary hypothesis testing where each individual…

Information Theory · Computer Science 2026-03-12 Joong Bum Rhim, Vivek K Goyal

Machine Teaching of Active Sequential Learners

Machine teaching addresses the problem of finding the best training data that can guide a learning algorithm to a target model with minimal effort. In conventional settings, a teacher provides data that are consistent with the true data…

Machine Learning · Computer Science 2019-11-04 Tomi Peltola, Mustafa Mert Çelikok, Pedram Daee, Samuel Kaski

No-Regret and Incentive-Compatible Online Learning

We study online learning settings in which experts act strategically to maximize their influence on the learning algorithm's predictions by potentially misreporting their beliefs about a sequence of binary events. Our goal is twofold.…

Machine Learning · Computer Science 2020-07-02 Rupert Freeman, David M. Pennock, Chara Podimata, Jennifer Wortman Vaughan

Learning to Use Learners' Advice

In this paper, we study a variant of the framework of online learning using expert advice with limited/bandit feedback. We consider each expert as a learning entity, seeking to more accurately reflect certain real-world applications. In…

Machine Learning · Computer Science 2017-02-21 Adish Singla, Hamed Hassani, Andreas Krause

Bayesian multitask inverse reinforcement learning

We generalise the problem of inverse reinforcement learning to multiple tasks, from multiple demonstrations. Each one may represent one expert trying to solve a different task, or different experts trying to solve the same task. Our main…

Machine Learning · Statistics 2012-09-04 Christos Dimitrakakis, Constantin Rothkopf

Dying Experts: Efficient Algorithms with Optimal Regret Bounds

We study a variant of decision-theoretic online learning in which the set of experts that are available to Learner can shrink over time. This is a restricted version of the well-studied sleeping experts problem, itself a generalization of…

Machine Learning · Computer Science 2019-10-31 Hamid Shayestehmanesh, Sajjad Azami, Nishant A. Mehta

From Bandits to Experts: On the Value of Side-Observations

We consider an adversarial online learning setting where a decision maker can choose an action in every stage of the game. In addition to observing the reward of the chosen action, the decision maker gets side observations on the reward he…

Machine Learning · Computer Science 2011-10-26 Shie Mannor, Ohad Shamir

Exploiting Expertise of Non-Expert and Diverse Agents in Social Bandit Learning: A Free Energy Approach

Personalized AI-based services involve a population of individual reinforcement learning agents. However, most reinforcement learning algorithms focus on harnessing individual learning and fail to leverage the social learning capabilities…

Machine Learning · Computer Science 2026-03-13 Erfan Mirzaei, Seyed Pooya Shariatpanahi, Alireza Tavakoli, Reshad Hosseini, Majid Nili Ahmadabadi

Identifiable Latent Bandits: Leveraging observational data for personalized decision-making

Sequential decision-making algorithms such as multi-armed bandits can find optimal personalized decisions, but are notoriously sample-hungry. In personalized medicine, for example, training a bandit from scratch for every patient is…

Machine Learning · Computer Science 2026-05-12 Ahmet Zahid Balcıoğlu, Newton Mwai, Emil Carlsson, Fredrik D. Johansson

Targeted Active Learning for Bayesian Decision-Making

Active learning is usually applied to acquire labels of informative data points in supervised learning, to maximize accuracy in a sample-efficient way. However, maximizing the accuracy is not the end goal when the results are used for…

Machine Learning · Statistics 2021-10-22 Louis Filstroff, Iiris Sundin, Petrus Mikkola, Aleksei Tiulpin, Juuso Kylmäoja, Samuel Kaski

Decision Market Based Learning For Multi-agent Contextual Bandit Problems

Information is often stored in a distributed and proprietary form, and agents who own information are often self-interested and require incentives to reveal their information. Suitable mechanisms are required to elicit and aggregate such…

Multiagent Systems · Computer Science 2022-12-02 Wenlong Wang, Thomas Pfeiffer

Leveraging Demonstrations to Improve Online Learning: Quality Matters

We investigate the extent to which offline demonstration data can improve online learning. It is natural to expect some improvement, but the question is how, and by how much? We show that the degree of improvement must depend on the quality…

Machine Learning · Computer Science 2023-05-18 Botao Hao, Rahul Jain, Tor Lattimore, Benjamin Van Roy, Zheng Wen

Expert-guided Bayesian Optimisation for Human-in-the-loop Experimental Design of Known Systems

Domain experts often possess valuable physical insights that are overlooked in fully automated decision-making processes such as Bayesian optimisation. In this article we apply high-throughput (batch) Bayesian optimisation alongside…

Machine Learning · Computer Science 2023-12-06 Tom Savage, Ehecatl Antonio del Rio Chanona

Optimal Learning for Sequential Decision Making for Expensive Cost Functions with Stochastic Binary Feedbacks

We consider the problem of sequentially making decisions that are rewarded by "successes" and "failures" which can be predicted through an unknown relationship that depends on a partially controllable vector of attributes for each instance.…

Machine Learning · Statistics 2017-09-18 Yingfei Wang, Chu Wang, Warren Powell