Related papers: Quantum Algorithm for Online Exp-concave Optimizat…

200 papers

We explore whether quantum advantages can be found for the zeroth-order online convex optimization problem, which is also known as bandit convex optimization with multi-point feedback. In this setting, given access to zeroth-order oracles…

Quantum Physics · Physics 2022-04-04 Jianhao He, Feidiao Yang, Jialin Zhang, Lvzhou Li

We consider online convex optimization with a zero-order oracle feedback. In particular, the decision maker does not know the explicit representation of the time-varying cost functions, or their gradients. At each time step, she observes…

Optimization and Control · Mathematics 2020-05-05 Tatiana Tatarenko, Maryam Kamgarpour

Bandit convex optimization (BCO) is a general framework for online decision making under uncertainty. While tight regret bounds for general convex losses have been established, existing algorithms achieving these bounds have prohibitive…

Machine Learning · Computer Science 2024-10-04 Arun Suggala, Y. Jennifer Sun, Praneeth Netrapalli, Elad Hazan

We consider the problem of online convex optimization against an arbitrary adversary with bandit feedback, known as bandit convex optimization. We give the first $\tilde{O}(\sqrt{T})$-regret algorithm for this setting based on a novel…

Machine Learning · Computer Science 2016-03-16 Elad Hazan, Yuanzhi Li

We study the problem of incentive-compatible online learning with bandit feedback. In this class of problems, the experts are self-interested agents who might misrepresent their preferences with the goal of being selected most often. The…

Machine Learning · Computer Science 2024-05-13 Julian Zimmert, Teodor V. Marinov

In this paper, we consider an online optimization process, where the objective functions are not convex (nor concave) but instead belong to a broad class of continuous submodular functions. We first propose a variant of the Frank-Wolfe…

Machine Learning · Statistics 2018-02-19 Lin Chen, Hamed Hassani, Amin Karbasi

In this work, we study the online convex optimization problem with curved losses and delayed feedback. When losses are strongly convex, existing approaches obtain regret bounds of order $d_{\max} \ln T$, where $d_{\max}$ is the maximum…

Machine Learning · Computer Science 2025-06-10 Hao Qiu, Emmanuel Esposito, Mengxiao Zhang

We develop a reduction-based framework for online learning with delayed feedback that recovers and improves upon existing results for both first-order and bandit convex optimization. Our approach introduces a continuous-time model under…

Machine Learning · Computer Science 2026-02-04 Alexander Ryabchenko, Idan Attias, Daniel M. Roy

In this paper, we study a special bandit setting of online stochastic linear optimization, where only one bit of information is revealed to the learner at each round. This problem has found many applications including online advertisement…

Machine Learning · Computer Science 2015-09-28 Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou

We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost/reward functions $f_t$ admit a "pseudo-1d" structure, i.e. $f_t(\mathbf{w}) = \ell_t(\mathrm{pred}_t(\mathbf{w}))$ where the output of $\mathrm{pred}_t$ is…

Machine Learning · Computer Science 2021-02-16 Aadirupa Saha, Nagarajan Natarajan, Praneeth Netrapalli, Prateek Jain

We investigate the problem of online convex optimization with unknown delays, in which the feedback of a decision arrives with an arbitrary delay. Previous studies have presented a delayed variant of online gradient descent (OGD), and…

Machine Learning · Computer Science 2021-03-23 Yuanyu Wan, Wei-Wei Tu, Lijun Zhang

In this paper, we analyze the problem of online convex optimization in different settings, including different feedback types (full-information, semi-bandit, bandit, etc.) in either the stochastic or non-stochastic setting and different notions of…

Machine Learning · Computer Science 2026-02-23 Mohammad Pedramfar, Vaneet Aggarwal

This paper studies bandit convex optimization in non-stationary environments with two-point feedback, using dynamic regret as the performance measure. We propose an algorithm based on bandit mirror descent that extends naturally to…

Optimization and Control · Mathematics 2026-05-26 Chang He, Bo Jiang, Shuzhong Zhang

In this paper, we investigate the online non-convex optimization problem which generalizes the classic online convex optimization problem by relaxing the convexity assumption on the cost function. For this type of problem, the classic…

Machine Learning · Computer Science 2017-09-14 Lin Yang, Cheng Tan, Wing Shing Wong

We revisit the challenge of designing online algorithms for the bandit convex optimization problem (BCO) which are also scalable to high dimensional problems. Hence, we consider algorithms that are projection-free, i.e., based on…

Machine Learning · Computer Science 2019-10-09 Dan Garber, Ben Kretzu

We study online reinforcement learning in linear Markov decision processes with adversarial losses and bandit feedback, without prior knowledge on transitions or access to simulators. We introduce two algorithms that achieve improved regret…

Machine Learning · Computer Science 2023-10-19 Haolin Liu, Chen-Yu Wei, Julian Zimmert

Motivated by applications in clinical trials and finance, we study the problem of online convex optimization (with bandit feedback) where the decision maker is risk-averse. We provide two algorithms to solve this problem. The first one is a…

Machine Learning · Computer Science 2018-10-02 Adrian Rivera Cardoso, Huan Xu

Motivated by applications to online learning in sparse estimation and Bayesian optimization, we consider the problem of online unconstrained nonsubmodular minimization with delayed costs in both full information and bandit feedback…

Machine Learning · Computer Science 2022-06-02 Tianyi Lin, Aldo Pacchiano, Yaodong Yu, Michael I. Jordan

We present an adaptive online gradient descent algorithm to solve online convex optimization problems with long-term constraints, which are constraints that need to be satisfied when accumulated over a finite number of rounds T, but can…

Machine Learning · Statistics 2015-12-24 Rodolphe Jenatton, Jim Huang, Cédric Archambeau

We consider the problem of online boosting for regression tasks, when only limited information is available to the learner. We give an efficient regret minimization method that has two implications: an online boosting algorithm with noisy…

Machine Learning · Computer Science 2020-07-24 Nataly Brukhim, Elad Hazan