Related papers: RELEAF: An Algorithm for Learning and Exploiting R…

200 papers

Many sequential decision-making tasks require choosing at each decision step the right action out of the vast set of possibilities by extracting actionable intelligence from high-dimensional data streams. Most of the time, the…

Machine Learning · Computer Science 2020-12-29 Eralp Turgay, Cem Bulucu, Cem Tekin

Ranking algorithms are fundamental to various online platforms, from e-commerce sites to content streaming services. Our research addresses the challenge of adaptively ranking items from a candidate pool for heterogeneous users, a key…

Machine Learning · Computer Science 2024-06-10 Jingyuan Wang, Perry Dong, Ying Jin, Ruohan Zhan, Zhengyuan Zhou

Taking advantage of contextual information can potentially boost the performance of recommender systems. In the era of big data, such side information often has several dimensions. Thus, developing decision-making algorithms to cope with…

Machine Learning · Computer Science 2023-07-26 Saeed Ghoorchian, Evgenii Kortukov, Setareh Maghsudi

We consider the problem of contextual bandits and imitation learning, where the learner lacks direct knowledge of the executed action's reward. Instead, the learner can actively query an expert at each round to compare two actions and…

Machine Learning · Computer Science 2023-07-25 Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu

We introduce the problem of model selection for contextual bandits, where a learner must adapt to the complexity of the optimal policy while balancing exploration and exploitation. Our main result is a new model selection guarantee for…

Machine Learning · Computer Science 2019-11-15 Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo

The contextual duelling bandit problem models adaptive recommender systems, where the algorithm presents a set of items to the user, and the user's choice reveals their preference. This setup is well suited for implicit choices users make…

Machine Learning · Computer Science 2025-08-27 Suryanarayana Sankagiri, Jalal Etesami, Pouria Fatemi, Matthias Grossglauser

We consider the following variant of contextual linear bandits motivated by routing applications in navigational engines and recommendation systems. We wish to learn a hidden $d$-dimensional value $w^*$. Every round, we are presented with a…

Machine Learning · Computer Science 2021-06-10 Sreenivas Gollapudi, Guru Guruganesh, Kostas Kollias, Pasin Manurangsi, Renato Paes Leme, Jon Schneider

We consider a contextual online learning (multi-armed bandit) problem with high-dimensional covariate $\mathbf{x}$ and decision $\mathbf{y}$. The reward function to learn, $f(\mathbf{x},\mathbf{y})$, does not have a particular parametric…

Machine Learning · Computer Science 2022-10-04 Wenhao Li, Ningyuan Chen, L. Jeff Hong

We provide the first oracle efficient sublinear regret algorithms for adversarial versions of the contextual bandit problem. In this problem, the learner repeatedly makes an action on the basis of a context and receives reward for the…

Machine Learning · Computer Science 2016-02-09 Vasilis Syrgkanis, Akshay Krishnamurthy, Robert E. Schapire

In the classical multi-armed bandit problem, instance-dependent algorithms attain improved performance on "easy" problems with a gap between the best and second-best arm. Are similar guarantees possible for contextual bandits? While…

Machine Learning · Computer Science 2020-10-08 Dylan J. Foster, Alexander Rakhlin, David Simchi-Levi, Yunzong Xu

In digital health and EdTech, recommendation systems face a significant challenge: users often choose impulsively, in ways that conflict with the platform's long-term payoffs. This misalignment makes it difficult to effectively learn to…

Machine Learning · Computer Science 2024-02-22 Arpit Agarwal, Rad Niazadeh, Prathamesh Patil

Recommendation systems are a key modern application of machine learning, but they have the downside that they often draw upon sensitive user information in making their predictions. We show how to address this deficiency by basing a…

Machine Learning · Computer Science 2021-12-03 Naveen Durvasula, Franklyn Wang, Scott Duke Kominers

Contextual bandit learning is a reinforcement learning problem where the learner repeatedly receives a set of features (context), takes an action and receives a reward based on the action and context. We consider this problem under a…

Machine Learning · Computer Science 2012-03-05 Alekh Agarwal, Miroslav Dudík, Satyen Kale, John Langford, Robert E. Schapire

We study the $K$-armed contextual dueling bandit problem, a sequential decision-making setting in which the learner uses contextual information to make two decisions, but only observes preference-based feedback suggesting that one…

Machine Learning · Computer Science 2021-11-25 Aadirupa Saha, Akshay Krishnamurthy

Thanks to the power of representation learning, neural contextual bandit algorithms demonstrate remarkable performance improvement against their classical counterparts. But because their exploration has to be performed in the entire neural…

Machine Learning · Computer Science 2022-03-22 Yiling Jia, Weitong Zhang, Dongruo Zhou, Quanquan Gu, Hongning Wang

Deep reinforcement learning has achieved impressive successes yet often requires a very large amount of interaction data. This result is perhaps unsurprising, as using complicated function approximation often requires more data to fit, and…

Machine Learning · Computer Science 2020-11-20 Jonathan N. Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill

The linear contextual bandit literature is mostly focused on the design of efficient learning algorithms for a given representation. However, a contextual bandit problem may admit multiple linear representations, each one with different…

Machine Learning · Computer Science 2021-04-09 Matteo Papini, Andrea Tirinzoni, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta

We consider the kernelized contextual bandit problem with a large feature space. This problem involves $K$ arms, and the goal of the forecaster is to maximize the cumulative rewards through learning the relationship between the contexts and…

Machine Learning · Statistics 2025-05-21 Shogo Iwazaki, Junpei Komiyama, Masaaki Imaizumi

Designing efficient general-purpose contextual bandit algorithms that work with large -- or even continuous -- action spaces would facilitate application to important scenarios such as information retrieval, recommendation systems, and…

Machine Learning · Computer Science 2022-07-14 Yinglun Zhu, Paul Mineiro

Contextual multi-armed bandit algorithms are widely used in sequential decision tasks such as news article recommendation systems, web page ad placement algorithms, and mobile health. Most of the existing algorithms have regret proportional…

Machine Learning · Statistics 2020-02-14 Gi-Soo Kim, Myunghee Cho Paik