Related papers: The AGI Containment Problem

200 papers

The development of Artificial General Intelligence (AGI) promises to be a major event. Along with its many potential benefits, it also raises serious safety concerns (Bostrom, 2014). The intention of this paper is to provide an easily…

Artificial Intelligence · Computer Science 2018-05-22 Tom Everitt, Gary Lea, Marcus Hutter

With almost daily improvements in capabilities of artificial intelligence it is more important than ever to develop safety software for use by the AI research community. Building on our previous work on AI Containment Problem we propose a…

Artificial Intelligence · Computer Science 2017-07-27 James Babcock, Janos Kramar, Roman V. Yampolskiy

Awareness of the possible impacts associated with artificial intelligence has risen in proportion to progress in the field. While there are tremendous benefits to society, many argue that there are just as many, if not more, concerns…

Artificial Intelligence · Computer Science 2021-08-03 Jason M. Pittman, Jesus P. Espinoza, Courtney Crosby

Software Testing is a well-established area in software engineering, encompassing various techniques and methodologies to ensure the quality and reliability of software systems. However, with the advent of generative artificial intelligence…

Software Engineering · Computer Science 2023-09-18 Aldeida Aleti

Artificial General Intelligence (AGI) promises transformative benefits but also presents significant risks. We develop an approach to address the risk of harms consequential enough to significantly harm humanity. We identify four areas of…

A number of leading AI companies, including OpenAI, Google DeepMind, and Anthropic, have the stated goal of building artificial general intelligence (AGI) - AI systems that achieve or exceed human performance across a wide range of…

Computers and Society · Computer Science 2023-05-15 Jonas Schuett, Noemi Dreksler, Markus Anderljung, David McCaffary, Lennart Heim, Emma Bluemke, Ben Garfinkel

Artificial Intelligence (AI) achieved super-human performance in a broad variety of domains. We say that an AI is made Artificially Stupid on a task when some limitations are deliberately introduced to match a human's ability to do the…

Artificial Intelligence · Computer Science 2018-08-14 Michaël Trazzi, Roman V. Yampolskiy

In coming years or decades, artificial general intelligence (AGI) may surpass human capabilities across many critical domains. We argue that, without substantial effort to prevent it, AGIs could learn to pursue goals that are in conflict…

Artificial Intelligence · Computer Science 2025-05-06 Richard Ngo, Lawrence Chan, Sören Mindermann

This chapter presents perspectives for challenges and future development in building reliable AI systems, particularly, agentic AI systems. Several open research problems related to mitigating the risks of cascading failures are discussed.…

Artificial Intelligence · Computer Science 2025-11-18 Liudong Xing, Janet, Lin

From early days, a key and controversial question inside the artificial intelligence community was whether Artificial General Intelligence (AGI) is achievable. AGI is the ability of machines and computer programs to achieve human-level…

Artificial Intelligence · Computer Science 2022-09-15 Mostafa Haghir Chehreghani

Artificial General Intelligence (AGI) is increasingly being discussed not only as a tool, but also as a potential subject with personal and therefore moral status. In our opinion, the currently dominant alignment strategies, which focus on…

Artificial Intelligence · Computer Science 2026-04-17 Till Mossakowski, Helena Esther Grass

Artificial intelligence (AI) has been advancing at a fast pace and it is now poised for deployment in a wide range of applications, such as autonomous systems, medical diagnosis and natural language processing. Early adoption of AI…

Machine Learning · Computer Science 2023-09-21 Marta Kwiatkowska, Xiyue Zhang

Corrigibility is a safety property for artificially intelligent agents. A corrigible agent will not resist attempts by authorized parties to alter the goals and constraints that were encoded in the agent when it was first started. This…

Artificial Intelligence · Computer Science 2020-04-06 Koen Holtman

A core challenge in the development of increasingly capable AI systems is to make them safe and reliable by ensuring their behaviour is consistent with human values. This challenge, known as the alignment problem, does not merely apply to…

Machine Learning · Computer Science 2023-11-07 Raphaël Millière

The rapid advancement of artificial intelligence has positioned data governance as a critical concern for responsible AI development. While frameworks exist for conventional AI systems, the potential emergence of Artificial General…

Computers and Society · Computer Science 2025-08-19 Masayuki Hatta

AI Generated Content (AIGC) has received tremendous attention within the past few years, with content generated in the format of image, text, audio, video, etc. Meanwhile, AIGC has become a double-edged sword and recently received much…

Artificial Intelligence · Computer Science 2023-12-29 Chen Chen, Jie Fu, Lingjuan Lyu

Rapid progress in machine learning and artificial intelligence (AI) has brought increasing attention to the potential impacts of AI technologies on society. In this paper we discuss one such potential impact: the problem of accidents in…

Artificial Intelligence · Computer Science 2016-07-26 Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dan Mané

Recent AI progress has outpaced expectations, with some experts now predicting AI that matches or exceeds human capabilities in all cognitive areas (AGI) could emerge this decade, potentially posing grave national and global security…

Computers and Society · Computer Science 2025-07-30 Sarah Hastings-Woodhouse

The issues of AI risk and AI safety are becoming critical as the prospect of artificial general intelligence (AGI) looms larger. The emergence of extremely large and capable generative models has led to alarming predictions and created a…

Artificial Intelligence · Computer Science 2025-05-20 Ali A. Minai

We describe a path to humanity safely thriving with powerful Artificial General Intelligences (AGIs) by building them to provably satisfy human-specified requirements. We argue that this will soon be technically feasible using advanced AI…

Computers and Society · Computer Science 2023-09-06 Max Tegmark, Steve Omohundro