Related papers: Contextual Decision Trees

A Practical Method for Solving Contextual Bandit Problems Using Decision Trees

Many efficient algorithms with strong theoretical guarantees have been proposed for the contextual multi-armed bandit problem. However, applying these algorithms in practice can be difficult because they require domain expertise to build…

Machine Learning · Computer Science 2018-10-23 Adam N. Elmachtoub, Ryan McNellis, Sechan Oh, Marek Petrik

Random Forest for the Contextual Bandit Problem - extended version

To address the contextual bandit problem, we propose an online random forest algorithm. The analysis of the proposed algorithm is based on the sample complexity needed to find the optimal decision stump. Then, the decision stumps are…

Machine Learning · Computer Science 2016-09-16 Raphaël Féraud, Robin Allesiardo, Tanguy Urvoy, Fabrice Clérot

Learning Contextual Bandits in a Non-stationary Environment

Multi-armed bandit algorithms have become a reference solution for handling the explore/exploit dilemma in recommender systems, and many other important real-world problems, such as display advertisement. However, such algorithms usually…

Machine Learning · Computer Science 2018-05-25 Qingyun Wu, Naveen Iyer, Hongning Wang

Tree Ensembles for Contextual Bandits

We propose a new framework for contextual multi-armed bandits based on tree ensembles. Our framework adapts two widely used bandit methods, Upper Confidence Bound and Thompson Sampling, for both standard and combinatorial settings. As part…

Machine Learning · Computer Science 2025-12-04 Hannes Nilsson, Rikard Johansson, Niklas Åkerblom, Morteza Haghir Chehreghani

Multi-Task Learning for Contextual Bandits

Contextual bandits are a form of multi-armed bandit in which the agent has access to predictive side information (known as the context) for each arm at each time step, and have been used to model personalized news recommendation, ad…

Machine Learning · Statistics 2017-05-25 Aniket Anand Deshmukh, Urun Dogan, Clayton Scott

Leveraging heterogeneous spillover in maximizing contextual bandit rewards

Recommender systems relying on contextual multi-armed bandits continuously improve relevant item recommendations by taking into account the contextual information. The objective of bandit algorithms is to learn the best arm (e.g., best item…

Machine Learning · Computer Science 2025-12-10 Ahmed Sayeed Faruk, Elena Zheleva

Top-K Ranking Deep Contextual Bandits for Information Selection Systems

In today's technology environment, information is abundant, dynamic, and heterogeneous in nature. Automated filtering and prioritization of information is based on the distinction between whether the information adds substantial value…

Machine Learning · Computer Science 2022-02-01 Jade Freeman, Michael Rawson

Selectively Contextual Bandits

Contextual bandits are widely used in industrial personalization systems. These online learning frameworks learn a treatment assignment policy in the presence of treatment effects that vary with the observed contextual features of the…

Machine Learning · Computer Science 2022-05-11 Claudia Roberts, Maria Dimakopoulou, Qifeng Qiao, Ashok Chandrashekhar, Tony Jebara

Contextual-Bandit Based Personalized Recommendation with Time-Varying User Interests

A contextual bandit problem is studied in a highly non-stationary environment, which is ubiquitous in various recommender systems due to the time-varying interests of users. Two models with disjoint and hybrid payoffs are considered to…

Machine Learning · Computer Science 2020-03-03 Xiao Xu, Fang Dong, Yanghua Li, Shaojian He, Xin Li

An Asymptotically Optimal Contextual Bandit Algorithm Using Hierarchical Structures

We propose online algorithms for sequential learning in the contextual multi-armed bandit setting. Our approach is to partition the context space and then optimally combine all of the possible mappings between the partition regions and the…

Machine Learning · Computer Science 2017-12-11 Mohammadreza Mohaghegh Neyshabouri, Kaan Gokcesu, Huseyin Ozkan, Suleyman S. Kozat

Contextual Bandit with Adaptive Feature Extraction

We consider an online decision making setting known as contextual bandit problem, and propose an approach for improving contextual bandit performance by using an adaptive feature extraction (representation learning) based on online…

Artificial Intelligence · Computer Science 2020-09-15 Baihan Lin, Djallel Bouneffouf, Guillermo Cecchi, Irina Rish

contextual: Evaluating Contextual Multi-Armed Bandit Problems in R

Over the past decade, contextual bandit algorithms have been gaining in popularity due to their effectiveness and flexibility in solving sequential decision problems, from online advertising and finance to clinical trial design and…

Machine Learning · Computer Science 2020-01-03 Robin van Emden, Maurits Kaptein

Online learning with Corrupted context: Corrupted Contextual Bandits

We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side-information, or context, available to a decision-maker) where the context used at each decision may be corrupted ("useless context"). This…

Machine Learning · Computer Science 2020-06-30 Djallel Bouneffouf

Contextual Bandit Learning with Predictable Rewards

Contextual bandit learning is a reinforcement learning problem where the learner repeatedly receives a set of features (context), takes an action and receives a reward based on the action and context. We consider this problem under a…

Machine Learning · Computer Science 2012-03-05 Alekh Agarwal, Miroslav Dudík, Satyen Kale, John Langford, Robert E. Schapire

Neural Dueling Bandits: Preference-Based Optimization with Human Feedback

Contextual dueling bandit is used to model the bandit problems, where a learner's goal is to find the best arm for a given context using observed noisy human preference feedback over the selected arms for the past contexts. However,…

Machine Learning · Computer Science 2025-04-17 Arun Verma, Zhongxiang Dai, Xiaoqiang Lin, Patrick Jaillet, Bryan Kian Hsiang Low

Bayesian Non-stationary Linear Bandits for Large-Scale Recommender Systems

Taking advantage of contextual information can potentially boost the performance of recommender systems. In the era of big data, such side information often has several dimensions. Thus, developing decision-making algorithms to cope with…

Machine Learning · Computer Science 2023-07-26 Saeed Ghoorchian, Evgenii Kortukov, Setareh Maghsudi

Contextual Bandit with Missing Rewards

We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side-information, or context, available to a decision-maker) where the reward associated with each context-based decision may not always be…

Machine Learning · Computer Science 2020-07-21 Djallel Bouneffouf, Sohini Upadhyay, Yasaman Khazaeni

Learning with Exposure Constraints in Recommendation Systems

Recommendation systems are dynamic economic systems that balance the needs of multiple stakeholders. A recent line of work studies incentives from the content providers' point of view. Content providers, e.g., vloggers and bloggers,…

Machine Learning · Computer Science 2023-11-13 Omer Ben-Porat, Rotem Torkan

Context Attentive Bandits: Contextual Bandit with Restricted Context

We consider a novel formulation of the multi-armed bandit model, which we call the contextual bandit with restricted context, where only a limited number of features can be accessed by the learner at every iteration. This novel formulation…

Artificial Intelligence · Computer Science 2017-06-09 Djallel Bouneffouf, Irina Rish, Guillermo A. Cecchi, Raphael Feraud

Contextual Information-Directed Sampling

Information-directed sampling (IDS) has recently demonstrated its potential as a data-efficient reinforcement learning algorithm. However, it is still unclear what is the right form of information ratio to optimize when contextual…

Machine Learning · Computer Science 2022-06-10 Botao Hao, Tor Lattimore, Chao Qin