Related papers: Value Alignment Equilibrium in Multiagent Systems

Value alignment: a formal approach

…principles that should govern autonomous AI systems. It essentially states that a system's goals and behaviour should be aligned with human values. But how to ensure value alignment? In this paper we first provide a formal model to…

Artificial Intelligence · Computer Science 2024-02-08 Carles Sierra, Nardine Osman, Pablo Noriega, Jordi Sabater-Mir, Antoni Perelló

Rethinking How AI Embeds and Adapts to Human Values: Challenges and Opportunities

The concepts of "human-centered AI" and "value-based decision" have gained significant attention in both research and industry. However, many critical aspects remain underexplored and require further investigation. In particular, there…

Artificial Intelligence · Computer Science 2025-08-26 Sz-Ting Tzeng, Frank Dignum

Multi-Value Alignment in Normative Multi-Agent System: An Evolutionary Optimisation Approach

Value-alignment in normative multi-agent systems is used to promote a certain value and to ensure the consistent behaviour of agents in autonomous intelligent systems with human values. However, the current literature is limited to the…

Multiagent Systems · Computer Science 2023-10-13 Maha Riad, Vinicius de Carvalho, Fatemeh Golpayegani

Understanding the Process of Human-AI Value Alignment

Background: Value alignment in computer science research is often used to refer to the process of aligning artificial intelligence with humans, but the way the phrase is used often lacks precision. Objectives: In this paper, we conduct a…

Computers and Society · Computer Science 2026-03-27 Jack McKinlay, Marina De Vos, Janina A. Hoffmann, Andreas Theodorou

Measuring Value Alignment

As artificial intelligence (AI) systems become increasingly integrated into various domains, ensuring that they align with human values becomes critical. This paper introduces a novel formalism to quantify the alignment between AI systems…

Artificial Intelligence · Computer Science 2023-12-27 Fazl Barez, Philip Torr

Multi-Value Alignment in Normative Multi-Agent System: Evolutionary Optimisation Approach

Value-alignment in normative multi-agent systems is used to promote a certain value and to ensure the consistent behavior of agents in autonomous intelligent systems with human values. However, the current literature is limited to…

Multiagent Systems · Computer Science 2023-05-15 Maha Riad, Vinicius Renan de Carvalho, Fatemeh Golpayegani

Multi-level Value Alignment in Agentic AI Systems: Survey and Perspectives

The ongoing evolution of AI paradigms has propelled AI research into the agentic AI stage. Consequently, the focus of research has shifted from single agents and simple applications towards multi-agent autonomous decision-making and task…

Artificial Intelligence · Computer Science 2025-08-08 Wei Zeng, Hengshu Zhu, Chuan Qin, Han Wu, Yihang Cheng, Sirui Zhang, Xiaowei Jin, Yinuo Shen, Zhenxing Wang, Feimin Zhong, Hui Xiong

Concept Alignment as a Prerequisite for Value Alignment

Value alignment is essential for building AI systems that can safely and reliably interact with people. However, what a person values -- and is even capable of valuing -- depends on the concepts that they are currently using to understand…

Artificial Intelligence · Computer Science 2023-11-01 Sunayana Rane, Mark Ho, Ilia Sucholutsky, Thomas L. Griffiths

Being Considerate as a Pathway Towards Pluralistic Alignment for Agentic AI

Pluralistic alignment is concerned with ensuring that an AI system's objectives and behaviors are in harmony with the diversity of human values and perspectives. In this paper we study the notion of pluralistic alignment in the context of…

Artificial Intelligence · Computer Science 2024-11-19 Parand A. Alamdari, Toryn Q. Klassen, Rodrigo Toro Icarte, Sheila A. McIlraith

Relative Principals, Pluralistic Alignment, and the Structural Value Alignment Problem

The value alignment problem for artificial intelligence (AI) is often framed as a purely technical or normative challenge, sometimes focused on hypothetical future systems. I argue that the problem is better understood as a structural…

Computers and Society · Computer Science 2026-04-23 Travis LaCroix

Artificial Intelligence, Values and Alignment

This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive…

Computers and Society · Computer Science 2020-10-07 Iason Gabriel

Training Value-Aligned Reinforcement Learning Agents Using a Normative Prior

As more machine learning agents interact with humans, it is increasingly a prospect that an agent trained to perform a task optimally, using only a measure of task performance as feedback, can violate societal norms for acceptable behavior…

Machine Learning · Computer Science 2021-04-20 Md Sultan Al Nahian, Spencer Frazier, Brent Harrison, Mark Riedl

Pragmatic-Pedagogic Value Alignment

As intelligent systems gain autonomy and capability, it becomes vital to ensure that their objectives match those of their human users; this is known as the value-alignment problem. In robotics, value alignment is key to the design of…

Artificial Intelligence · Computer Science 2018-02-07 Jaime F. Fisac, Monica A. Gates, Jessica B. Hamrick, Chang Liu, Dylan Hadfield-Menell, Malayandi Palaniappan, Dhruv Malik, S. Shankar Sastry, Thomas L. Griffiths, Anca D. Dragan

Can an AI Agent Safely Run a Government? Existence of Probably Approximately Aligned Policies

While autonomous agents often surpass humans in their ability to handle vast and complex data, their potential misalignment (i.e., lack of transparency regarding their true objective) has thus far hindered their use in critical applications…

Artificial Intelligence · Computer Science 2024-12-03 Frédéric Berdoz, Roger Wattenhofer

The Challenge of Value Alignment: from Fairer Algorithms to AI Safety

This paper addresses the question of how to align AI systems with human values and situates it within a wider body of thought regarding technology and value. Far from existing in a vacuum, there has long been an interest in the ability of…

Computers and Society · Computer Science 2021-01-19 Iason Gabriel, Vafa Ghazavi

The Coming Crisis of Multi-Agent Misalignment: AI Alignment Must Be a Dynamic and Social Process

This position paper states that AI Alignment in Multi-Agent Systems (MAS) should be considered a dynamic and interaction-dependent process that heavily depends on the social environment where agents are deployed, either collaborative,…

Artificial Intelligence · Computer Science 2025-06-09 Florian Carichon, Aditi Khandelwal, Marylou Fauchard, Golnoosh Farnadi

Concept Alignment

Discussion of AI alignment (alignment between humans and AI systems) has focused on value alignment, broadly referring to creating AI systems that share human values. We argue that before we can even attempt to align values, it is…

Machine Learning · Computer Science 2024-01-18 Sunayana Rane, Polyphony J. Bruna, Ilia Sucholutsky, Christopher Kello, Thomas L. Griffiths

Advantage Alignment Algorithms

Artificially intelligent agents are increasingly being integrated into human decision-making: from large language model (LLM) assistants to autonomous vehicles. These systems often optimize their individual objective, leading to conflicts,…

Machine Learning · Computer Science 2025-02-07 Juan Agustin Duque, Milad Aghajohari, Tim Cooijmans, Razvan Ciuca, Tianyu Zhang, Gauthier Gidel, Aaron Courville

Full-Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value

Beneficial societal outcomes cannot be guaranteed by aligning individual AI systems with the intentions of their operators or users. Even an AI system that is perfectly aligned to the intentions of its operating organization can lead to bad…

Machine Learning · Computer Science 2025-12-04 Joe Edelman, Tan Zhi-Xuan, Ryan Lowe, Oliver Klingefjord, Vincent Wang-Mascianica, Matija Franklin, Ryan Othniel Kearns, Ellie Hain, Atrisha Sarkar, Michiel Bakker, Fazl Barez, David Duvenaud, Jakob Foerster, Iason Gabriel, Joseph Gubbels, Bryce Goodman, Andreas Haupt, Jobst Heitzig, Julian Jara-Ettinger, Atoosa Kasirzadeh, James Ravi Kirkpatrick, Andrew Koh, W. Bradley Knox, Philipp Koralus, Joel Lehman, Sydney Levine, Samuele Marro, Manon Revel, Toby Shorin, Morgan Sutherland, Michael Henry Tessler, Ivan Vendrov, James Wilken-Smith

Ethics2vec: aligning automatic agents and human preferences

Though intelligent agents are supposed to improve human experience (or make it more efficient), it is hard from a human perspective to grasp the ethical values which are explicitly or implicitly embedded in an agent's behaviour. This is the…

Artificial Intelligence · Computer Science 2025-08-12 Gianluca Bontempi