English
Related papers

Related papers: Concept Alignment

200 papers

Value alignment is essential for building AI systems that can safely and reliably interact with people. However, what a person values -- and is even capable of valuing -- depends on the concepts that they are currently using to understand…

Artificial Intelligence · Computer Science 2023-11-01 Sunayana Rane , Mark Ho , Ilia Sucholutsky , Thomas L. Griffiths

Background: Value alignment in computer science research is often used to refer to the process of aligning artificial intelligence with humans, but the way the phrase is used often lacks precision. Objectives: In this paper, we conduct a…

Computers and Society · Computer Science 2026-03-27 Jack McKinlay , Marina De Vos , Janina A. Hoffmann , Andreas Theodorou

This paper addresses the question of how to align AI systems with human values and situates it within a wider body of thought regarding technology and value. Far from existing in a vacuum, there has long been an interest in the ability of…

Computers and Society · Computer Science 2021-01-19 Iason Gabriel , Vafa Ghazavi

The concepts of ``human-centered AI'' and ``value-based decision'' have gained significant attention in both research and industry. However, many critical aspects remain underexplored and require further investigation. In particular, there…

Artificial Intelligence · Computer Science 2025-08-26 Sz-Ting Tzeng , Frank Dignum

This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive…

Computers and Society · Computer Science 2020-10-07 Iason Gabriel

Recent advances in general-purpose AI underscore the urgent need to align AI systems with human goals and values. Yet, the lack of a clear, shared understanding of what constitutes "alignment" limits meaningful progress and…

AI alignment aims to make AI systems behave in line with human intentions and values. As AI systems grow more capable, so do risks from misalignment. To provide a comprehensive and up-to-date overview of the alignment field, in this survey,…

AI alignment is about ensuring AI systems only pursue goals and activities that are beneficial to humans. Most of the current approach to AI alignment is to learn what humans value from their behavioural data. This paper proposes a…

Artificial Intelligence · Computer Science 2023-10-06 Pei-Yu Chen , Myrthe L. Tielman , Dirk K. J. Heylen , Catholijn M. Jonker , M. Birna van Riemsdijk

One of today's most significant societal challenges is building AI systems whose behaviour, or the behaviour it enables within communities of interacting agents (human and artificial), aligns with human values. To address this challenge, we…

Artificial Intelligence · Computer Science 2026-02-09 Nardine Osman , Mark d'Inverno

AI alignment considers how we can encode AI systems in a way that is compatible with human values. The normative side of this problem asks what moral values or principles, if any, we should encode in AI. To this end, we present a framework…

Computers and Society · Computer Science 2023-01-11 Betty Li Hou , Brian Patrick Green

The field of AI alignment aims to steer AI systems toward human goals, preferences, and ethical principles. Its contributions have been instrumental for improving the output quality, safety, and trustworthiness of today's AI models. This…

Artificial Intelligence · Computer Science 2024-11-26 Robert West , Roland Aydin

Recent advances in AI -- including generative approaches -- have resulted in technology that can support humans in scientific discovery and forming decisions, but may also disrupt democracies and target individuals. The responsible use of…

Given that Artificial Intelligence (AI) increasingly permeates our lives, it is critical that we systematically align AI objectives with the goals and values of humans. The human-AI alignment problem stems from the impracticality of…

Computers and Society · Computer Science 2022-07-05 John Nay , James Daily

principles that should govern autonomous AI systems. It essentially states that a system's goals and behaviour should be aligned with human values. But how to ensure value alignment? In this paper we first provide a formal model to…

Artificial Intelligence · Computer Science 2024-02-08 Carles Sierra , Nardine Osman , Pablo Noriega , Jordi Sabater-Mir , Antoni Perelló

Minimizing negative impacts of Artificial Intelligent (AI) systems on human societies without human supervision requires them to be able to align with human values. However, most current work only addresses this issue from a technical point…

Computation and Language · Computer Science 2024-08-13 Mehdi Khamassi , Marceau Nahon , Raja Chatila

The value-alignment problem for artificial intelligence (AI) asks how we can ensure that the 'values' (i.e., objective functions) of artificial systems are aligned with the values of humanity. In this paper, I argue that linguistic…

Artificial Intelligence · Computer Science 2022-07-05 Travis LaCroix

We propose the creation of a systematic effort to identify and replicate key findings in neuropsychology and allied fields related to understanding human values. Our aim is to ensure that research underpinning the value alignment problem of…

Artificial Intelligence · Computer Science 2018-09-11 Gopal P. Sarma , Nick J. Hay , Adam Safron

Value alignment problems arise in scenarios where the specified objectives of an AI agent don't match the true underlying objective of its users. The problem has been widely argued to be one of the central safety problems in AI.…

Artificial Intelligence · Computer Science 2023-02-10 Malek Mechergui , Sarath Sreedharan

As general-purpose artificial intelligence (AI) systems become increasingly integrated with diverse human communities, cultural alignment has emerged as a crucial element in their deployment. Most existing approaches treat cultural…

Artificial Intelligence · Computer Science 2025-03-11 Michal Bravansky , Filip Trhlik , Fazl Barez

AI intent alignment, ensuring that AI produces outcomes as intended by users, is a critical challenge in human-AI interaction. The emergence of generative AI, including LLMs, has intensified the significance of this problem, as interactions…

Human-Computer Interaction · Computer Science 2024-06-21 Yoonsu Kim , Kihoon Son , Seoyoung Kim , Juho Kim
‹ Prev 1 2 3 10 Next ›