Related papers: Offline A/B testing for Recommender Systems

200 papers

Efficient methods to evaluate new algorithms are critical for improving interactive bandit and reinforcement learning systems such as recommendation systems. A/B tests are reliable, but are time- and money-consuming, and entail a risk of…

Machine Learning · Computer Science 2021-08-04 Yusuke Narita, Shota Yasui, Kohei Yata

Modern recommender systems face an increasing need to explain their recommendations. Despite considerable progress in this area, evaluating the quality of explanations remains a significant challenge for researchers and practitioners. Prior…

Artificial Intelligence · Computer Science 2022-11-18 Yuanshun Yao, Chong Wang, Hang Li

Counterfactual estimators are critical for learning and refining policies using logged data, a process known as Off-Policy Evaluation (OPE). OPE allows researchers to assess new policies without costly experiments, speeding up the…

Artificial Intelligence · Computer Science 2025-01-10 Ritam Guha, Nilavra Pathak

The evaluation of recommendation systems is a complex task. The offline and online evaluation metrics for recommender systems are ambiguous in their true objectives. The majority of recently published papers benchmark their methods using…

Information Retrieval · Computer Science 2023-08-15 Petr Kasalický, Rodrigo Alves, Pavel Kordík

Offline evaluations of recommender systems attempt to estimate users' satisfaction with recommendations using static data from prior user interactions. These evaluations provide researchers and developers with first approximations of the…

Information Retrieval · Computer Science 2020-01-28 Mucun Tian, Michael D. Ekstrand

Both in academic and industry-based research, online evaluation methods are seen as the gold standard for interactive applications like recommendation systems. Naturally, the reason for this is that we can directly measure utility metrics…

Information Retrieval · Computer Science 2022-09-20 Imad Aouali, Amine Benhalloum, Martin Bompaire, Benjamin Heymann, Olivier Jeunen, David Rohde, Otmane Sakhi, Flavian Vasile

Recommender systems exemplify sequential decision-making under uncertainty, strategically deciding what content to serve to users, to optimise a range of potential objectives. To balance the explore-exploit trade-off successfully, Thompson…

Information Retrieval · Computer Science 2025-07-09 Olivier Jeunen

Evaluation plays a crucial role in the development of ranking algorithms on search and recommender systems. It enables online platforms to create user-friendly features that drive commercial success in a steady and effective manner. The…

Information Retrieval · Computer Science 2025-08-04 Qing Zhang, Alex Deng, Michelle Du, Huiji Gao, Liwei He, Sanjeev Katariya

Offline policy optimization could have a large impact on many real-world decision-making problems, as online learning may be infeasible in many applications. Importance sampling and its variants are a commonly used type of estimator in…

Machine Learning · Computer Science 2022-07-05 Yao Liu, Yannis Flet-Berliac, Emma Brunskill

Accurately evaluating new policies (e.g. ad-placement models, ranking functions, recommendation functions) is one of the key prerequisites for improving interactive systems. While the conventional approach to evaluation relies on online A/B…

Machine Learning · Computer Science 2017-06-27 Aman Agarwal, Soumya Basu, Tobias Schnabel, Thorsten Joachims

In applying reinforcement learning (RL) to high-stakes domains, quantitative and qualitative evaluation using observational data can help practitioners understand the generalization performance of new policies. However, this type of…

Machine Learning · Computer Science 2023-10-27 Shengpu Tang, Jenna Wiens

A critical need for industrial recommender systems is the ability to evaluate recommendation policies offline, before deploying them to production. Unfortunately, widely used off-policy evaluation methods either make strong assumptions…

Machine Learning · Computer Science 2022-10-19 Alexander Buchholz, Ben London, Giuseppe di Benedetto, Thorsten Joachims

Even though offline evaluation is just an imperfect proxy of online performance -- due to the interactive nature of recommenders -- it will probably remain the primary way of evaluation in recommender systems research for the foreseeable…

Information Retrieval · Computer Science 2023-07-28 Balázs Hidasi, Ádám Tibor Czapp

We provide a comparative study of several widely used off-policy estimators (Empirical Average, Basic Importance Sampling and Normalized Importance Sampling), detailing the different regimes where they are individually suboptimal. We then…

Machine Learning · Statistics 2019-01-30 Thomas Nedelec, Nicolas Le Roux, Vianney Perchet

The ability to perform offline A/B-testing and off-policy learning using logged contextual bandit feedback is highly desirable in a broad range of applications, including recommender systems, search engines, ad placement, and personalized…

Machine Learning · Computer Science 2019-08-30 Yi Su, Lequn Wang, Michele Santacatterina, Thorsten Joachims

Offline evaluation plays a central role in benchmarking recommender systems when online testing is impractical or risky. However, it is susceptible to two key sources of bias: exposure bias, where users only interact with items they are…

Information Retrieval · Computer Science 2025-08-12 Bruno L. Pereira, Alan Said, Rodrygo L. T. Santos

We address the problem of A/B testing, a widely used protocol for evaluating the potential improvement achieved by a new decision system compared to a baseline. This protocol segments the population into two subgroups, each exposed to a…

Machine Learning · Statistics 2025-06-16 Otmane Sakhi, Alexandre Gilotte, David Rohde

Recommendation systems have been integrated into the majority of large online systems to filter and rank information according to user profiles. They thus influence the way users interact with the system and, as a consequence, bias the…

Information Retrieval · Computer Science 2015-11-05 Arnaud De Myttenaere, Boris Golden, Bénédicte Le Grand, Fabrice Rossi

Evaluating the causal effect of recommendations is an important objective because the causal effect on user interactions can directly lead to an increase in sales and user engagement. To select an optimal recommendation model, it is common…

Machine Learning · Computer Science 2021-07-16 Masahiro Sato

Recommender systems are widely used AI applications designed to help users efficiently discover relevant items. The effectiveness of such systems is tied to the satisfaction of both users and providers. However, user satisfaction is complex…

Information Retrieval · Computer Science 2024-11-05 Ali Elahi, Armin Zirak