Related papers: The Alignment Problem from a Deep Learning Perspec…

200 papers

Artificial General Intelligence (AGI) is increasingly being discussed not only as a tool, but also as a potential subject with personal and therefore moral status. In our opinion, the currently dominant alignment strategies, which focus on…

Artificial Intelligence · Computer Science 2026-04-17 Till Mossakowski , Helena Esther Grass

The issues of AI risk and AI safety are becoming critical as the prospect of artificial general intelligence (AGI) looms larger. The emergence of extremely large and capable generative models has led to alarming predictions and created a…

Artificial Intelligence · Computer Science 2025-05-20 Ali A. Minai

The AI alignment problem, which focusses on ensuring that artificial intelligence (AI), including AGI and ASI, systems act according to human values, presents profound challenges. With the progression from narrow AI to Artificial General…

Artificial Intelligence · Computer Science 2025-07-25 Alberto Hernández-Espinosa , Felipe S. Abrahão , Olaf Witkowski , Hector Zenil

The recent leap in AI capabilities, driven by big generative models, has sparked the possibility of achieving Artificial General Intelligence (AGI) and further triggered discussions on Artificial Superintelligence (ASI), a system surpassing…

Artificial Intelligence · Computer Science 2026-02-10 HyunJin Kim , Xiaoyuan Yi , Jing Yao , Muhua Huang , JinYeong Bak , James Evans , Xing Xie

From early days, a key and controversial question inside the artificial intelligence community was whether Artificial General Intelligence (AGI) is achievable. AGI is the ability of machines and computer programs to achieve human-level…

Artificial Intelligence · Computer Science 2022-09-15 Mostafa Haghir Chehreghani

Creating systems that are aligned with our goals is seen as a leading approach to create safe and beneficial AI in both leading AI companies and the academic field of AI safety. We defend the view that misaligned AGI - future, generally…

Computers and Society · Computer Science 2025-06-05 Max Hellrigel-Holderbaum , Leonard Dung

Artificial General Intelligence (AGI) promises transformative benefits but also presents significant risks. We develop an approach to address the risk of harms consequential enough to significantly harm humanity. We identify four areas of…

The field of AI alignment aims to steer AI systems toward human goals, preferences, and ethical principles. Its contributions have been instrumental for improving the output quality, safety, and trustworthiness of today's AI models. This…

Artificial Intelligence · Computer Science 2024-11-26 Robert West , Roland Aydin

A leading proposal for aligning artificial superintelligence (ASI) is to use AI agents to automate an increasing fraction of alignment research as capabilities improve. We argue that, even when research agents are not scheming to…

Artificial Intelligence · Computer Science 2026-05-18 Aleksandr Bowkis , Marie Davidsen Buhl , Jacob Pfau , Geoffrey Irving

The AI-alignment problem arises when there is a discrepancy between the goals that a human designer specifies to an AI learner and a potential catastrophic outcome that does not reflect what the human designer really wants. We argue that a…

Machine Learning · Computer Science 2020-04-10 Shai Shalev-Shwartz , Shaked Shammah , Amnon Shashua

General intelligence, the ability to solve arbitrary solvable problems, is supposed by many to be artificially constructible. Narrow intelligence, the ability to solve a given particularly difficult problem, has seen impressive recent…

Artificial Intelligence · Computer Science 2020-07-22 Michael K Cohen , Badri Vellambi , Marcus Hutter

The field of AI alignment is concerned with AI systems that pursue unintended goals. One commonly studied mechanism by which an unintended goal might arise is specification gaming, in which the designer-provided specification is flawed in a…

Machine Learning · Computer Science 2022-11-03 Rohin Shah , Vikrant Varma , Ramana Kumar , Mary Phuong , Victoria Krakovna , Jonathan Uesato , Zac Kenton

Artificial General Intelligence (AGI) has been a long-standing goal of humanity, with the aim of creating machines capable of performing any intellectual task that humans can do. To achieve this, AGI researchers draw inspiration from the…

Artificial Intelligence · Computer Science 2023-03-29 Lin Zhao , Lu Zhang , Zihao Wu , Yuzhong Chen , Haixing Dai , Xiaowei Yu , Zhengliang Liu , Tuo Zhang , Xintao Hu , Xi Jiang , Xiang Li , Dajiang Zhu , Dinggang Shen , Tianming Liu

We conduct an incentivized laboratory experiment to study people's perception of generative artificial intelligence (GenAI) alignment in the context of economic decision-making. Using a panel of economic problems spanning the domains of…

Theoretical Economics · Economics 2026-04-03 Kevin He , Ran Shorrer , Mengjia Xia

Artificial general intelligence (AGI) does not yet exist, but given the pace of technological development in artificial intelligence, it is projected to reach human-level intelligence within roughly the next two decades. After that, many…

Computers and Society · Computer Science 2023-11-16 David R. Mandel

The original vision of AI was re-articulated in 2002 via the term 'Artificial General Intelligence' or AGI. This vision is to build 'Thinking Machines' - computer systems that can learn, reason, and solve problems similar to the way humans…

Artificial Intelligence · Computer Science 2023-09-20 Peter Voss , Mladjan Jovanovic

In recent years, deep learning using neural network architecture, i.e. deep neural networks, has been on the frontier of computer science research. It has even led to superhuman performance in some problems, e.g., in computer vision, games…

Machine Learning · Computer Science 2022-04-07 Maciej Świechowski

A core challenge in the development of increasingly capable AI systems is to make them safe and reliable by ensuring their behaviour is consistent with human values. This challenge, known as the alignment problem, does not merely apply to…

Machine Learning · Computer Science 2023-11-07 Raphaël Millière

As AI adoption expands across human society, the problem of aligning AI models to match human preferences remains a grand challenge. Currently, the AI alignment field is deeply divided between behavioral and representational approaches,…

Computers and Society · Computer Science 2025-08-12 Ben Y. Reis , William La Cava

Artificial intelligence (AI) is advancing exponentially and is likely to have profound impacts on human wellbeing, social equity, and environmental sustainability. Here we argue that the "alignment problem" in AI research is also an…

General Economics · Economics 2026-04-30 Daniel W. O'Neill , Stefano Vrizzi , Noemi Luna Carmeno , Felix Creutzig , Jefim Vogel