Related papers: Reinforcement Learning Beyond Expectation

Privacy-Preserving Reinforcement Learning Beyond Expectation

Cyber and cyber-physical systems equipped with machine learning algorithms such as autonomous cars share environments with humans. In such a setting, it is important to align system (or agent) behaviors with the preferences of one or more…

Machine Learning · Computer Science 2022-03-22 Arezoo Rajabi, Bhaskar Ramasubramanian, Abdullah Al Maruf, Radha Poovendran

Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control

Cumulative prospect theory (CPT) is known to model human decisions well, with substantial empirical evidence supporting this claim. CPT works by distorting probabilities and is more general than the classic expected utility and coherent…

Machine Learning · Computer Science 2016-03-01 Prashanth L. A., Cheng Jie, Michael Fu, Steve Marcus, Csaba Szepesvári

Interpretable Modelling of Driving Behaviors in Interactive Driving Scenarios based on Cumulative Prospect Theory

Understanding human driving behavior is important for autonomous vehicles. In this paper, we propose an interpretable human behavior model in interactive driving scenarios based on the cumulative prospect theory (CPT). As a non-expected…

Artificial Intelligence · Computer Science 2019-07-23 Liting Sun, Wei Zhan, Yeping Hu, Masayoshi Tomizuka

Reinforcement Learning in Economics and Finance

Reinforcement learning algorithms describe how an agent can learn an optimal action policy in a sequential decision process, through repeated experience. In a given environment, the agent policy provides him some running and terminal…

Theoretical Economics · Economics 2020-03-24 Arthur Charpentier, Romuald Elie, Carl Remlinger

Reinforcement learning for quantum processes with memory

In reinforcement learning, an agent interacts sequentially with an environment to maximize a reward, receiving only partial, probabilistic feedback. This creates a fundamental exploration-exploitation trade-off: the agent must explore to…

Quantum Physics · Physics 2026-03-27 Josep Lumbreras, Ruo Cheng Huang, Yanglin Hu, Marco Fanizza, Mile Gu

Policy Gradients for Cumulative Prospect Theory in Reinforcement Learning

We derive a policy gradient theorem for Cumulative Prospect Theory (CPT) objectives in finite-horizon Reinforcement Learning (RL), generalizing the standard policy gradient theorem and encompassing distortion-based risk objectives as…

Machine Learning · Computer Science 2026-02-18 Olivier Lepel, Anas Barakat

Goal-Oriented Semantic Resource Allocation with Cumulative Prospect Theoretic Agents

We introduce a resource allocation framework for goal-oriented semantic networks, where participating agents assess system quality through subjective (e.g., context-dependent) perceptions. To accommodate this, our model accounts for agents…

Information Theory · Computer Science 2025-06-06 Symeon Vaidanis, Photios A. Stavrou, Marios Kountouris

Interpretable Multi-Objective Reinforcement Learning through Policy Orchestration

Autonomous cyber-physical agents and systems play an increasingly large role in our lives. To ensure that agents behave in ways aligned with the values of the societies in which they operate, we must develop techniques that allow these…

Machine Learning · Computer Science 2018-09-25 Ritesh Noothigattu, Djallel Bouneffouf, Nicholas Mattei, Rachita Chandra, Piyush Madan, Kush Varshney, Murray Campbell, Moninder Singh, Francesca Rossi

Learning Diverse Risk Preferences in Population-based Self-play

Among the great successes of Reinforcement Learning (RL), self-play algorithms play an essential role in solving competitive games. Current self-play algorithms optimize the agent to maximize expected win-rates against its current or…

Machine Learning · Computer Science 2023-12-18 Yuhua Jiang, Qihan Liu, Xiaoteng Ma, Chenghao Li, Yiqin Yang, Jun Yang, Bin Liang, Qianchuan Zhao

Learning the Preferences of a Learning Agent

For AI systems to be useful to humans, they must understand and act in accordance with our values and preferences. Since specifying preferences is a hard task, inverse reinforcement learning (IRL) aims to develop methods that allow for…

Artificial Intelligence · Computer Science 2026-05-12 Karim Abdel Sadek, Mark Bedaywi, Rhys Gould, Stuart Russell

Assessing Human Interaction in Virtual Reality With Continually Learning Prediction Agents Based on Reinforcement Learning Algorithms: A Pilot Study

Artificial intelligence systems increasingly involve continual learning to enable flexibility in general situations that are not encountered during system training. Human interaction with autonomous systems is broadly studied, but research…

Artificial Intelligence · Computer Science 2022-04-25 Dylan J. A. Brenneis, Adam S. Parker, Michael Bradley Johanson, Andrew Butcher, Elnaz Davoodi, Leslie Acker, Matthew M. Botvinick, Joseph Modayil, Adam White, Patrick M. Pilarski

Adversarial Driving Behavior Generation Incorporating Human Risk Cognition for Autonomous Vehicle Evaluation

Autonomous vehicle (AV) evaluation has been the subject of increased interest in recent years both in industry and in academia. This paper focuses on the development of a novel framework for generating adversarial driving behavior of…

Artificial Intelligence · Computer Science 2023-10-17 Zhen Liu, Hang Gao, Hao Ma, Shuo Cai, Yunfeng Hu, Ting Qu, Hong Chen, Xun Gong

Policy-Based Reinforcement Learning for Assortative Matching in Human Behavior Modeling

This paper explores human behavior in virtual networked communities, specifically the potential and expressive capacity of individuals or groups to respond to internal and external stimuli, with assortative matching as a typical example. A…

Multiagent Systems · Computer Science 2023-09-06 Ou Deng, Qun Jin

Towards Preference Learning for Autonomous Ground Robot Navigation Tasks

We are interested in the design of autonomous robot behaviors that learn the preferences of users over continued interactions, with the goal of efficiently executing navigation behaviors in a way that the user expects. In this paper, we…

Robotics · Computer Science 2020-11-06 Cory Hayes, Matthew Marge

Maximum Causal Entropy Inverse Constrained Reinforcement Learning

When deploying artificial agents in real-world environments where they interact with humans, it is crucial that their behavior is aligned with the values, social norms or other requirements of that environment. However, many environments…

Machine Learning · Computer Science 2023-05-05 Mattijs Baert, Pietro Mazzaglia, Sam Leroux, Pieter Simoens

Reinforcement Learning with Policy Mixture Model for Temporal Point Processes Clustering

Temporal point process is an expressive tool for modeling event sequences over time. In this paper, we take a reinforcement learning view whereby the observed sequences are assumed to be generated from a mixture of latent policies. The…

Machine Learning · Computer Science 2019-07-01 Weichang Wu, Junchi Yan, Xiaokang Yang, Hongyuan Zha

Constrained Exploration in Reinforcement Learning with Optimality Preservation

We consider a class of reinforcement-learning systems in which the agent follows a behavior policy to explore a discrete state-action space to find an optimal policy while adhering to some restriction on its behavior. Such restriction may…

Machine Learning · Computer Science 2023-04-07 Peter C. Y. Chen

Reinforcement Learning in Education: A Multi-Armed Bandit Approach

Advances in reinforcement learning research have demonstrated the ways in which different agent-based models can learn how to optimally perform a task within a given environment. Reinforcement learning solves unsupervised problems where…

Machine Learning · Computer Science 2022-11-03 Herkulaas Combrink, Vukosi Marivate, Benjamin Rosman

What Should I Know? Using Meta-gradient Descent for Predictive Feature Discovery in a Single Stream of Experience

In computational reinforcement learning, a growing body of work seeks to construct an agent's perception of the world through predictions of future sensations; predictions about environment observations are used as additional input features…

Machine Learning · Computer Science 2022-06-15 Alexandra Kearney, Anna Koop, Johannes Günther, Patrick M. Pilarski

Constrained Policy Optimization

For many applications of reinforcement learning it can be more convenient to specify both a reward function and constraints, rather than trying to design behavior through the reward function. For example, systems that physically interact…

Machine Learning · Computer Science 2017-05-31 Joshua Achiam, David Held, Aviv Tamar, Pieter Abbeel