Related papers: Adaptive Hedging under Delayed Feedback

Online Aggregation of Unbounded Losses Using Shifting Experts with Confidence

We develop the setting of sequential prediction based on shifting experts and on a "smooth" version of the method of specialized experts. To aggregate experts' predictions, we use the AdaHedge algorithm, which is a version of the Hedge…

Machine Learning · Computer Science 2020-01-24 Vladimir V'yugin, Vladimir Trunov

A new Hedging algorithm and its application to inferring latent random variables

We present a new online learning algorithm for cumulative discounted gain. This learning algorithm does not use exponential weights on the experts. Instead, it uses a weighting scheme that depends on the regret of the master algorithm…

Computer Science and Game Theory · Computer Science 2008-07-01 Yoav Freund, Daniel Hsu

An aggregating strategy for shifting experts in discrete sequence prediction

We study how we can adapt a predictor to a non-stationary environment with advice from multiple experts. We study the problem under complete feedback when the best expert changes over time from a decision theoretic point of view. Proposed…

Machine Learning · Computer Science 2017-08-08 Vishnu Raj, Sheetal Kalyani

Optimistic and Adaptive Lagrangian Hedging

In online learning an algorithm plays against an environment with losses possibly picked by an adversary at each round. The generality of this framework includes problems that are not adversarial, for example offline optimization, or saddle…

Machine Learning · Computer Science 2021-02-04 Ryan D'Orazio, Ruitong Huang

Prediction with Expert Advice under Discounted Loss

We study prediction with expert advice in the setting where the losses are accumulated with some discounting: the impact of old losses may gradually vanish. We generalize the Aggregating Algorithm and the Aggregating Algorithm for…

Machine Learning · Computer Science 2010-06-07 Alexey Chernov, Fedor Zhdanov

On the optimality of the Hedge algorithm in the stochastic regime

In this paper, we study the behavior of the Hedge algorithm in the online stochastic setting. We prove that anytime Hedge with decreasing learning rate, which is one of the simplest algorithms for the problem of prediction with expert…

Machine Learning · Statistics 2019-07-10 Jaouad Mourtada, Stéphane Gaïffas

A Generalized Online Algorithm for Translation and Scale Invariant Prediction with Expert Advice

In this work, we aim to create a completely online algorithmic framework for prediction with expert advice that is translation-free and scale-free of the expert losses. Our goal is to create a generalized algorithm that is suitable for use…

Machine Learning · Computer Science 2020-09-10 Kaan Gokcesu, Hakan Gokcesu

Adaptive Hedge

Most methods for decision-theoretic online learning are based on the Hedge algorithm, which takes a parameter called the learning rate. In most previous analyses the learning rate was carefully tuned to obtain optimal worst-case…

Machine Learning · Statistics 2015-03-04 Tim van Erven, Peter Grünwald, Wouter M. Koolen, Steven de Rooij

Delayed Feedback in Generalised Linear Bandits Revisited

The stochastic generalised linear bandit is a well-understood model for sequential decision-making problems, with many algorithms achieving near-optimal regret guarantees under immediate feedback. However, the stringent requirement for…

Machine Learning · Computer Science 2023-04-12 Benjamin Howson, Ciara Pike-Burke, Sarah Filippi

Adaptive Experimentation with Delayed Binary Feedback

Conducting experiments with objectives that take significant delays to materialize (e.g. conversions, add-to-cart events, etc.) is challenging. Although the classical "split sample testing" is still valid for the delayed feedback, the…

Information Retrieval · Computer Science 2022-02-03 Zenan Wang, Carlos Carrion, Xiliang Lin, Fuhua Ji, Yongjun Bao, Weipeng Yan

Adapting to Delays and Data in Adversarial Multi-Armed Bandits

We consider the adversarial multi-armed bandit problem under delayed feedback. We analyze variants of the Exp3 algorithm that tune their step-size using only information (about the losses and delays) available at the time of the decisions,…

Machine Learning · Computer Science 2020-10-14 András György, Pooria Joulani

Online Algorithm for Aggregating Experts' Predictions with Unbounded Quadratic Loss

We consider the problem of online aggregation of expert predictions with the quadratic loss function. We propose an algorithm for aggregating expert predictions which does not require prior knowledge of the upper bound on the losses. The…

Machine Learning · Computer Science 2025-01-14 Alexander Korotin, Vladimir V'yugin, Evgeny Burnaev

A Second-order Bound with Excess Losses

We study online aggregation of the predictions of experts, and first show new second-order regret bounds in the standard setting, which are obtained via a version of the Prod algorithm (and also a version of the polynomially weighted…

Machine Learning · Statistics 2014-02-11 Pierre Gaillard, Gilles Stoltz, Tim van Erven

Adaptive Online Prediction by Following the Perturbed Leader

When applying aggregating strategies to Prediction with Expert Advice, the learning rate must be adaptively tuned. The natural choice of sqrt(complexity/current loss) renders the analysis of Weighted Majority derivatives quite complicated.…

Artificial Intelligence · Computer Science 2007-05-23 Marcus Hutter, Jan Poland

Prediction with expert evaluators' advice

We introduce a new protocol for prediction with expert advice in which each expert evaluates the learner's and his own performance using a loss function that may change over time and may be different from the loss functions used by the…

Machine Learning · Computer Science 2009-03-23 Alexey Chernov, Vladimir Vovk

Adaptive and Efficient Algorithms for Tracking the Best Expert

In this paper, we consider the problem of prediction with expert advice in dynamic environments. We choose tracking regret as the performance metric and develop two adaptive and efficient algorithms with data-dependent tracking regret…

Machine Learning · Computer Science 2020-02-11 Shiyin Lu, Lijun Zhang

Private Online Prediction from Experts: Separations and Faster Rates

Online prediction from experts is a fundamental problem in machine learning and several works have studied this problem under privacy constraints. We propose and analyze new algorithms for this problem that improve over the regret bounds of…

Machine Learning · Computer Science 2023-07-03 Hilal Asi, Vitaly Feldman, Tomer Koren, Kunal Talwar

Aggregating Strategies for Long-term Forecasting

This article investigates the application of aggregating algorithms to the problem of long-term forecasting. We examine the classic aggregating algorithms based on exponential reweighting. For the general Vovk's…

Machine Learning · Computer Science 2019-02-27 Alexander Korotin, Vladimir V'yugin, Evgeny Burnaev

Handling Delayed Feedback in Distributed Online Optimization: A Projection-Free Approach

Learning at the edges has become increasingly important as large quantities of data are continually generated locally. Among others, this paradigm requires algorithms that are simple (so that they can be executed by local devices), robust…

Machine Learning · Computer Science 2024-02-06 Tuan-Anh Nguyen, Nguyen Kim Thang, Denis Trystram

Biased Dueling Bandits with Stochastic Delayed Feedback

The dueling bandit problem, an essential variation of the traditional multi-armed bandit problem, has become significantly prominent recently due to its broad applications in online advertising, recommendation systems, information…

Machine Learning · Computer Science 2025-04-08 Bongsoo Yi, Yue Kang, Yao Li