Related papers: Value Function Decomposition in Markov Recommendat…

200 papers

In recommender systems, reinforcement learning solutions have shown promising results in optimizing the long-term performance of the interaction sequence between users and the system. For practical reasons, the policy's actions are…

Information Retrieval · Computer Science 2024-06-19 Xiaobei Wang, Shuchang Liu, Xueliang Wang, Qingpeng Cai, Lantao Hu, Han Li, Peng Jiang, Kun Gai, Guangming Xie

Value functions are crucial for model-free Reinforcement Learning (RL) to obtain a policy implicitly or guide the policy updates. Value estimation heavily depends on the stochasticity of environmental dynamics and the quality of reward…

Machine Learning · Computer Science 2019-05-28 Hongyao Tang, Jianye Hao, Guangyong Chen, Pengfei Chen, Zhaopeng Meng, Yaodong Yang, Li Wang

Designing reinforcement learning (RL) agents is typically a difficult process that requires numerous design iterations. Learning can fail for a multitude of reasons, and standard RL methods offer too few tools to provide insight into the…

Machine Learning · Computer Science 2022-10-24 James MacGlashan, Evan Archer, Alisa Devlic, Takuma Seno, Craig Sherstan, Peter R. Wurman, Peter Stone

We study the policy evaluation problem in multi-agent reinforcement learning, modeled by a Markov decision process. In this problem, the agents operate in a common environment under a fixed control policy, working together to discover the…

Optimization and Control · Mathematics 2020-01-13 Thinh T. Doan, Siva Theja Maguluri, Justin Romberg

In this paper we propose several novel distributed gradient-based temporal difference algorithms for multi-agent off-policy learning of linear approximation of the value function in Markov decision processes with strict information…

Machine Learning · Computer Science 2021-04-20 Milos S. Stankovic, Marko Beko, Srdjan S. Stankovic

Modeling user sequential behaviors has recently attracted increasing attention in the recommendation domain. Existing methods mostly assume coherent preference in the same sequence. However, user personalities are volatile and easily…

Information Retrieval · Computer Science 2022-04-01 Weiqi Shao, Xu Chen, Long Xia, Jiashu Zhao, Dawei Yin

Value decomposition has long been a fundamental technique in multi-agent dynamic programming and reinforcement learning (RL). Specifically, the value function of a global state $(s_1,s_2,\ldots,s_N)$ is often approximated as the sum of…

Machine Learning · Computer Science 2025-11-14 Shuze Chen, Tianyi Peng

The value function is the central notion of Reinforcement Learning (RL). Value estimation, especially with function approximation, can be challenging since it involves the stochasticity of environmental dynamics and reward signals that can be…

Machine Learning · Computer Science 2021-03-04 Hongyao Tang, Jianye Hao, Guangyong Chen, Pengfei Chen, Chen Chen, Yaodong Yang, Luo Zhang, Wulong Liu, Zhaopeng Meng

Distributional reinforcement learning (RL) is a powerful framework increasingly adopted in safety-critical domains for its ability to optimize risk-sensitive objectives. However, the role of the discount factor is often overlooked, as it is…

Machine Learning · Computer Science 2026-02-05 Mehrdad Moghimi, Anthony Coache, Hyejin Ku

This study addresses the challenges of dynamics and complexity in intelligent human-computer interaction and proposes a reinforcement learning-based optimization framework to improve long-term returns and overall experience. Human-computer…

Human-Computer Interaction · Computer Science 2025-11-03 Rui Liu, Yifan Zhuang, Runsheng Zhang

Recently, model-based agents have achieved better performance than model-free ones using the same computational budget and training time in single-agent environments. However, due to the complexity of multi-agent systems, it is tough to…

Multiagent Systems · Computer Science 2022-12-08 Zhiwei Xu, Dapeng Li, Bin Zhang, Yuan Zhan, Yunpeng Bai, Guoliang Fan

In classic reinforcement learning (RL) and decision-making problems, policies are evaluated with respect to a scalar reward function, and all optimal policies are the same with regard to their expected return. However, many real-world…

Machine Learning · Computer Science 2023-11-02 Han Shao, Lee Cohen, Avrim Blum, Yishay Mansour, Aadirupa Saha, Matthew R. Walter

Auction-based recommender systems are prevalent in online advertising platforms, but they are typically optimized to allocate recommendation slots based on immediate expected return metrics, neglecting the downstream effects of…

Information Retrieval · Computer Science 2023-08-01 Ruiyang Xu, Jalaj Bhandari, Dmytro Korenkevych, Fan Liu, Yuchen He, Alex Nikulkov, Zheqing Zhu

Algorithmic decisions made by machine learning models in high-stakes domains may have lasting impacts over time. However, naive applications of standard fairness criteria in static settings over temporal domains may lead to delayed and…

Machine Learning · Computer Science 2022-03-01 Jianfeng Chi, Jian Shen, Xinyi Dai, Weinan Zhang, Yuan Tian, Han Zhao

We study the policy evaluation problem in multi-agent reinforcement learning. In this problem, a group of agents works cooperatively to evaluate the value function for the global discounted accumulative reward problem, which is composed of…

Optimization and Control · Mathematics 2019-06-04 Thinh T. Doan, Siva Theja Maguluri, Justin Romberg

In a wide variety of applications, humans interact with a complex environment by means of asynchronous stochastic discrete events in continuous time. Can we design online interventions that will help humans achieve certain goals in such…

Machine Learning · Computer Science 2018-11-07 Utkarsh Upadhyay, Abir De, Manuel Gomez-Rodriguez

We study policy evaluation problems in multi-task reinforcement learning (RL) under a low-rank representation setting. In this setting, we are given $N$ learning tasks where the corresponding value functions of these tasks lie in an…

Machine Learning · Computer Science 2025-03-05 Yitao Bai, Sihan Zeng, Justin Romberg, Thinh T. Doan

In the physical world, people have dynamic preferences, e.g., the same situation can lead to satisfaction for some humans and to frustration for others. Personalization is called for. The same observation holds for online behavior with…

Information Retrieval · Computer Science 2017-08-16 Ziming Li, Julia Kiseleva, Maarten de Rijke, Artem Grotov

Although Reinforcement Learning (RL) algorithms have found tremendous success in simulated domains, they often cannot directly be applied to physical systems, especially in cases where there are hard constraints to satisfy (e.g. on safety…

Machine Learning · Computer Science 2020-08-28 Harsh Satija, Philip Amortila, Joelle Pineau

In this paper, we propose a novel model-based multi-agent reinforcement learning approach named Value Decomposition Framework with Disentangled World Model to address the challenge of achieving a common goal of multiple agents interacting…

Machine Learning · Computer Science 2025-09-29 Zhizun Wang, David Meger