Related papers: Aligned: A Platform-based Process for Alignment

200 papers

Alignment of artificial intelligence (AI) encompasses the normative problem of specifying how AI systems should act and the technical problem of ensuring AI systems comply with those specifications. To date, AI alignment has generally…

Existing alignment research is dominated by concerns about safety and preventing harm: safeguards, controllability, and compliance. This paradigm of alignment parallels early psychology's focus on mental illness: necessary but incomplete.…

AI alignment aims to make AI systems behave in line with human intentions and values. As AI systems grow more capable, so do risks from misalignment. To provide a comprehensive and up-to-date overview of the alignment field, in this survey,…

The issues of AI risk and AI safety are becoming critical as the prospect of artificial general intelligence (AGI) looms larger. The emergence of extremely large and capable generative models has led to alarming predictions and created a…

Artificial Intelligence · Computer Science 2025-05-20 Ali A. Minai

As AI adoption expands across human society, the problem of aligning AI models to match human preferences remains a grand challenge. Currently, the AI alignment field is deeply divided between behavioral and representational approaches,…

Computers and Society · Computer Science 2025-08-12 Ben Y. Reis, William La Cava

Recent advances in AI research make it increasingly plausible that artificial agents with consequential real-world impact will soon operate beyond tightly controlled environments. Ensuring that these agents are not only safe but that they…

Computers and Society · Computer Science 2025-06-10 Kevin Baum

This paper explores the potential of a multidisciplinary approach to testing and aligning artificial intelligence (AI), specifically focusing on large language models (LLMs). Due to the rapid development and wide application of LLMs,…

Computers and Society · Computer Science 2025-01-07 Ljubisa Bojic, Matteo Cinelli, Dubravko Culibrk, Boris Delibasic

As artificial intelligence scales, the concepts of alignment, agency, and autonomy have become central to AI safety, governance, and control. However, even in human contexts, these terms lack universal definitions, varying across…

Computers and Society · Computer Science 2025-03-11 Krti Tallam

The emergence of large language models (LLMs) has raised the possibility of Artificial Superintelligence (ASI), a hypothetical AI system surpassing human intelligence. However, existing alignment paradigms struggle to guide such…

Machine Learning · Computer Science 2024-12-30 HyunJin Kim, Xiaoyuan Yi, Jing Yao, Jianxun Lian, Muhua Huang, Shitong Duan, JinYeong Bak, Xing Xie

Biological and artificial information processing systems form representations of the world that they can use to categorize, reason, plan, navigate, and make decisions. How can we measure the similarity between the representations formed by…

As artificial intelligence (AI) systems become increasingly integral to critical infrastructure and global operations, the need for a unified, trustworthy governance framework is more urgent than ever. This paper proposes a novel approach…

Artificial Intelligence · Computer Science 2025-01-17 Vikram Kulothungan

This position paper argues that effectively "democratizing AI" requires democratic governance and alignment of AI, and that this is particularly valuable for decisions with systemic societal impacts. Initial steps -- such as Meta's…

This year, jurisdictions worldwide, including the United States, the European Union, the United Kingdom, and China, are set to enact or revise laws governing frontier AI. Their efforts largely rely on the assumption that increasing model…

Computers and Society · Computer Science 2025-02-25 Nicholas A. Caputo

While much research in artificial intelligence (AI) has focused on scaling capabilities, the accelerating pace of development makes countervailing work on producing harmless, "aligned" systems increasingly urgent. Yet research on alignment…

Artificial Intelligence · Computer Science 2025-12-12 Dani Roytburg, Beck Miller

International institutions may have an important role to play in ensuring advanced AI systems benefit humanity. International collaborations can unlock AI's ability to further sustainable development, and coordination of regulatory efforts…

Purpose: The governance of artificial intelligence (AI) systems requires a structured approach that connects high-level regulatory principles with practical implementation. Existing frameworks lack clarity on how regulations translate into…

Computers and Society · Computer Science 2025-09-16 Avinash Agarwal, Manisha J. Nene

This paper offers a roadmap for the development of scalable aligned artificial intelligence (AI) from first principle descriptions of natural intelligence. In brief, a possible path toward scalable aligned AI rests upon enabling artificial…

This paper contributes to the nascent debate around safety cases for frontier AI systems. Safety cases are structured, defensible arguments that a system is acceptably safe to deploy in a given context. Historically, they have been used in…

Computers and Society · Computer Science 2026-03-11 Shaun Feakins, Ibrahim Habli, Phillip Morgan

Jurisprudence, the study of how judges should properly decide cases, and alignment, the science of getting AI models to conform to human values, share a fundamental structure. These seemingly distant fields both seek to predict and shape…

Artificial Intelligence · Computer Science 2026-05-12 Nicholas Caputo

This position paper argues that formal optimal control theory should be central to AI alignment research, offering a distinct perspective from prevailing AI safety and security approaches. While recent work in AI safety and mechanistic…

Artificial Intelligence · Computer Science 2025-06-24 Elija Perrier