Related papers: Artificial Intelligence, Values and Alignment

200 papers

This paper addresses the question of how to align AI systems with human values and situates it within a wider body of thought regarding technology and value. Far from existing in a vacuum, there has long been an interest in the ability of…

Computers and Society · Computer Science 2021-01-19 Iason Gabriel, Vafa Ghazavi

Solving the AI alignment problem requires having clear, defensible values towards which AI systems can align. Currently, targets for alignment remain underspecified and do not seem to be built from a philosophically robust structure. We…

Computers and Society · Computer Science 2023-11-29 Betty Li Hou, Brian Patrick Green

The field of AI alignment aims to steer AI systems toward human goals, preferences, and ethical principles. Its contributions have been instrumental for improving the output quality, safety, and trustworthiness of today's AI models. This…

Artificial Intelligence · Computer Science 2024-11-26 Robert West, Roland Aydin

Alignment of artificial intelligence (AI) encompasses the normative problem of specifying how AI systems should act and the technical problem of ensuring AI systems comply with those specifications. To date, AI alignment has generally…

In high-stakes AI-supported decisions, considerations are not purely technical but involve moral judgments about fairness, responsibility, and harm. While prior research has focused mainly on functional or behavioral alignment, this paper…

Human-Computer Interaction · Computer Science 2026-04-17 Christiane Ernst, Luis Gutmann, Domenique Zipperling, Kathrin Figl, Niklas Kühl

Background: Value alignment in computer science research is often used to refer to the process of aligning artificial intelligence with humans, but the way the phrase is used often lacks precision. Objectives: In this paper, we conduct a…

Computers and Society · Computer Science 2026-03-27 Jack McKinlay, Marina De Vos, Janina A. Hoffmann, Andreas Theodorou

AI alignment considers how we can encode AI systems in a way that is compatible with human values. The normative side of this problem asks what moral values or principles, if any, we should encode in AI. To this end, we present a framework…

Computers and Society · Computer Science 2023-01-11 Betty Li Hou, Brian Patrick Green

AI alignment aims to make AI systems behave in line with human intentions and values. As AI systems grow more capable, so do risks from misalignment. To provide a comprehensive and up-to-date overview of the alignment field, in this survey,…

Given that Artificial Intelligence (AI) increasingly permeates our lives, it is critical that we systematically align AI objectives with the goals and values of humans. The human-AI alignment problem stems from the impracticality of…

Computers and Society · Computer Science 2022-07-05 John Nay, James Daily

Discussion of AI alignment (alignment between humans and AI systems) has focused on value alignment, broadly referring to creating AI systems that share human values. We argue that before we can even attempt to align values, it is…

Machine Learning · Computer Science 2024-01-18 Sunayana Rane, Polyphony J. Bruna, Ilia Sucholutsky, Christopher Kello, Thomas L. Griffiths

The critical inquiry pervading the realm of Philosophy, and perhaps extending its influence across all Humanities disciplines, revolves around the intricacies of morality and normativity. Surprisingly, in recent years, this thematic thread…

Artificial Intelligence · Computer Science 2024-06-19 Nicholas Kluge Corrêa

With increasing digitalization, Artificial Intelligence (AI) is becoming ubiquitous. AI-based systems to identify, optimize, automate, and scale solutions to complex economic and societal problems are being proposed and implemented. This…

The value alignment problem for artificial intelligence (AI) is often framed as a purely technical or normative challenge, sometimes focused on hypothetical future systems. I argue that the problem is better understood as a structural…

Computers and Society · Computer Science 2026-04-23 Travis LaCroix

principles that should govern autonomous AI systems. It essentially states that a system's goals and behaviour should be aligned with human values. But how to ensure value alignment? In this paper we first provide a formal model to…

Artificial Intelligence · Computer Science 2024-02-08 Carles Sierra, Nardine Osman, Pablo Noriega, Jordi Sabater-Mir, Antoni Perelló

An important step in the development of value alignment (VA) systems in AI is understanding how values can interrelate with facts. Designers of future VA systems will need to utilize a hybrid approach in which ethical reasoning and…

Artificial Intelligence · Computer Science 2019-07-15 Tae Wan Kim, Thomas Donaldson, John Hooker

The AI-alignment problem arises when there is a discrepancy between the goals that a human designer specifies to an AI learner and a potential catastrophic outcome that does not reflect what the human designer really wants. We argue that a…

Machine Learning · Computer Science 2020-04-10 Shai Shalev-Shwartz, Shaked Shammah, Amnon Shashua

The project of aligning machine behavior with human values raises a basic problem: whose moral expectations should guide AI decision-making? Much alignment research assumes that the appropriate benchmark is how humans themselves would act…

Computers and Society · Computer Science 2026-05-13 Benjamin Minhao Chen, Xinyu Xie

A morally acceptable course of AI development should avoid two dangers: creating unaligned AI systems that pose a threat to humanity and mistreating AI systems that merit moral consideration in their own right. This paper argues these two…

Computers and Society · Computer Science 2025-10-16 Adam Bradley, Bradford Saad

Value alignment problems arise in scenarios where the specified objectives of an AI agent don't match the true underlying objective of its users. The problem has been widely argued to be one of the central safety problems in AI.…

Artificial Intelligence · Computer Science 2023-02-10 Malek Mechergui, Sarath Sreedharan

Artificial Intelligence (AI) is an effective science which employs strong enough approaches, methods, and techniques to solve unsolvable real world based problems. Because of its unstoppable rise towards the future, there are also some…

Artificial Intelligence · Computer Science 2017-06-12 Alice Pavaloiu, Utku Kose