Related papers: Exploring Recommender System Evaluation: A Multi-M…

AgentA/B: Automated and Scalable Web A/BTesting with Interactive LLM Agents

A/B testing experiment is a widely adopted method for evaluating UI/UX design decisions in modern web applications. Yet, traditional A/B testing remains constrained by its dependence on the large-scale and live traffic of human…

Human-Computer Interaction · Computer Science 2026-03-12 Yuxuan Lu, Ting-Yao Hsu, Hansu Gu, Limeng Cui, Yaochen Xie, William Headden, Bingsheng Yao, Akash Veeragouni, Jiapeng Liu, Sreyashi Nag, Jessie Wang, Dakuo Wang

Beyond Offline A/B Testing: Context-Aware Agent Simulation for Recommender System Evaluation

Recommender systems are central to online services, enabling users to navigate through massive amounts of content across various domains. However, their evaluation remains challenging due to the disconnect between offline metrics and online…

Information Retrieval · Computer Science 2026-04-14 Nicolas Bougie, Gian Maria Marconi, Xiaotong Ye, Narimasa Watanabe

An Online A/B Testing Decision Support System for Web Usability Assessment Based on a Linguistic Decision-making Methodology: Case of Study a Virtual Learning Environment

In recent years, attention has increasingly focused on enhancing user satisfaction with user interfaces, spanning both mobile applications and websites. One fundamental aspect of human-machine interaction is the concept of web usability. In…

Software Engineering · Computer Science 2025-07-17 Noe Zermeño, Cristina Zuheros, Lucas Daniel Del Rosso Calache, Francisco Herrera, Rosana Montes

LLM-Based Multi-Agent System for Simulating and Analyzing Marketing and Consumer Behavior

Simulating consumer decision-making is vital for designing and evaluating marketing strategies before costly real-world deployment. However, post-event analyses and rule-based agent-based models (ABMs) struggle to capture the complexity of…

Artificial Intelligence · Computer Science 2025-10-22 Man-Lin Chu, Lucian Terhorst, Kadin Reed, Tom Ni, Weiwei Chen, Rongyu Lin

AgentRecBench: Benchmarking LLM Agent-based Personalized Recommender Systems

The emergence of agentic recommender systems powered by Large Language Models (LLMs) represents a paradigm shift in personalized recommendations, leveraging LLMs' advanced reasoning and role-playing capabilities to enable autonomous,…

Information Retrieval · Computer Science 2025-05-29 Yu Shang, Peijie Liu, Yuwei Yan, Zijing Wu, Leheng Sheng, Yuanqing Yu, Chumeng Jiang, An Zhang, Fengli Xu, Yu Wang, Min Zhang, Yong Li

Beyond Static Testbeds: An Interaction-Centric Agent Simulation Platform for Dynamic Recommender Systems

Evaluating and iterating upon recommender systems is crucial, yet traditional A/B testing is resource-intensive, and offline methods struggle with dynamic user-platform interactions. While agent-based simulation is promising, existing…

Computation and Language · Computer Science 2025-09-29 Song Jin, Juntian Zhang, Yuhan Liu, Xun Zhang, Yufei Zhang, Guojun Yin, Fei Jiang, Wei Lin, Rui Yan

Agent-Based Modelling Meets Generative AI in Social Network Simulations

Agent-Based Modelling (ABM) has emerged as an essential tool for simulating social networks, encompassing diverse phenomena such as information dissemination, influence dynamics, and community formation. However, manually configuring varied…

Social and Information Networks · Computer Science 2024-11-26 Antonino Ferraro, Antonio Galli, Valerio La Gatta, Marco Postiglione, Gian Marco Orlando, Diego Russo, Giuseppe Riccio, Antonio Romano, Vincenzo Moscato

SimUSER: Simulating User Behavior with Large Language Models for Recommender System Evaluation

Recommender systems play a central role in numerous real-life applications, yet evaluating their performance remains a significant challenge due to the gap between offline metrics and online behaviors. Given the scarcity and limits (e.g.,…

Information Retrieval · Computer Science 2025-04-18 Nicolas Bougie, Narimasa Watanabe

On Generative Agents in Recommendation

Recommender systems are the cornerstone of today's information dissemination, yet a disconnect between offline metrics and online performance greatly hinders their development. Addressing this challenge, we envision a recommendation…

Information Retrieval · Computer Science 2024-11-11 An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-Seng Chua

An Automated Multi-modal Evaluation Framework for Mobile Intelligent Assistants Based on Large Language Models and Multi-Agent Collaboration

With the rapid development of mobile intelligent assistant technologies, multi-modal AI assistants have become essential interfaces for daily user interactions. However, current evaluation methods face challenges including high manual…

Artificial Intelligence · Computer Science 2025-10-22 Meiping Wang, Jian Zhong, Rongduo Han, Liming Kang, Zhengkun Shi, Xiao Liang, Xing Lin, Nan Gao, Haining Zhang

A Survey on LLM-powered Agents for Recommender Systems

Recommender systems are essential components of many online platforms, yet traditional approaches still struggle with understanding complex user preferences and providing explainable recommendations. The emergence of Large Language Model…

Information Retrieval · Computer Science 2025-03-05 Qiyao Peng, Hongtao Liu, Hua Huang, Qing Yang, Minglai Shao

A Framework for Studying AI Agent Behavior: Evidence from Consumer Choice Experiments

Environments built for people are increasingly operated by a new class of economic actors: LLM-powered software agents making decisions on our behalf. These decisions range from our purchases to travel plans to medical treatment selection.…

Artificial Intelligence · Computer Science 2026-02-25 Manuel Cherep, Chengtian Ma, Abigail Xu, Maya Shaked, Pattie Maes, Nikhil Singh

Agentic Feedback Loop Modeling Improves Recommendation and User Simulation

Large language model-based agents are increasingly applied in the recommendation field due to their extensive knowledge and strong planning capabilities. While prior research has primarily focused on enhancing either the recommendation…

Information Retrieval · Computer Science 2025-05-05 Shihao Cai, Jizhi Zhang, Keqin Bao, Chongming Gao, Qifan Wang, Fuli Feng, Xiangnan He

SimAB: Simulating A/B Tests with Persona-Conditioned AI Agents for Rapid Design Evaluation

A/B testing is a standard method for validating design decisions, yet its reliance on real user traffic limits iteration speed and makes certain experiments impractical. We present SimAB, a system that reframes A/B testing as a fast,…

Human-Computer Interaction · Computer Science 2026-03-03 Tim Rieder, Marian Schneider, Mario Truss, Vitaly Tsaplin, Alina Rublea, Sinem Dere, Francisco Chicharro Sanz, Tobias Reiss, Mustafa Doga Dogan

Understanding Longitudinal Dynamics of Recommender Systems with Agent-Based Modeling and Simulation

Today's research in recommender systems is largely based on experimental designs that are static in a sense that they do not consider potential longitudinal effects of providing recommendations to users. In reality, however, various…

Information Retrieval · Computer Science 2021-08-26 Gediminas Adomavicius, Dietmar Jannach, Stephan Leitner, Jingjing Zhang

AgentBench: Evaluating LLMs as Agents

The potential of Large Language Model (LLM) as agents has been widely acknowledged recently. Thus, there is an urgent need to quantitatively evaluate LLMs as agents on challenging tasks in interactive environments. We present…

Artificial Intelligence · Computer Science 2025-10-07 Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, Shudan Zhang, Xiang Deng, Aohan Zeng, Zhengxiao Du, Chenhui Zhang, Sheng Shen, Tianjun Zhang, Yu Su, Huan Sun, Minlie Huang, Yuxiao Dong, Jie Tang

LMAgent: A Large-scale Multimodal Agents Society for Multi-user Simulation

The believable simulation of multi-user behavior is crucial for understanding complex social systems. Recently, large language models (LLMs)-based AI agents have made significant progress, enabling them to achieve human-like intelligence…

Artificial Intelligence · Computer Science 2024-12-16 Yijun Liu, Wu Liu, Xiaoyan Gu, Yong Rui, Xiaodong He, Yongdong Zhang

User Behavior Simulation with Large Language Model based Agents

Simulating high quality user behavior data has always been a fundamental problem in human-centered applications, where the major difficulty originates from the intricate mechanism of human decision process. Recently, substantial evidences…

Information Retrieval · Computer Science 2024-02-16 Lei Wang, Jingsen Zhang, Hao Yang, Zhiyuan Chen, Jiakai Tang, Zeyu Zhang, Xu Chen, Yankai Lin, Ruihua Song, Wayne Xin Zhao, Jun Xu, Zhicheng Dou, Jun Wang, Ji-Rong Wen

Accelerated learning from recommender systems using multi-armed bandit

Recommendation systems are a vital component of many online marketplaces, where there are often millions of items to potentially present to users who have a wide variety of wants or needs. Evaluating recommender system algorithms is a hard…

Information Retrieval · Computer Science 2019-08-20 Meisam Hejazinia, Kyler Eastman, Shuqin Ye, Abbas Amirabadi, Ravi Divvela

Beyond Static Evaluation: Rethinking the Assessment of Personalized Agent Adaptability in Information Retrieval

Personalized AI agents are becoming central to modern information retrieval, yet most evaluation methodologies remain static, relying on fixed benchmarks and one-off metrics that fail to reflect how users' needs evolve over time. These…

Information Retrieval · Computer Science 2025-10-07 Kirandeep Kaur, Preetam Prabhu Srikar Dammu, Hideo Joho, Chirag Shah