Related papers: Safety without alignment

200 papers

Existing alignment research is dominated by concerns about safety and preventing harm: safeguards, controllability, and compliance. This paradigm of alignment parallels early psychology's focus on mental illness: necessary but incomplete.…

Traditional safety engineering is coming to a turning point, moving from deterministic, non-evolving systems operating in well-defined contexts to increasingly autonomous and learning-enabled AI systems which are acting in largely…

Artificial Intelligence · Computer Science 2022-05-13 Harald Rueß, Simon Burton

In this position paper, we address the persistent gap between rapidly growing AI capabilities and lagging safety progress. Existing paradigms divide into "Make AI Safe", which applies post-hoc alignment and guardrails but remains brittle…

Machine Learning · Computer Science 2025-09-09 Youbang Sun, Xiang Wang, Jie Fu, Chaochao Lu, Bowen Zhou

The last decade has seen major improvements in the performance of artificial intelligence, which has driven widespread applications. Unforeseen effects of such mass adoption have put the notion of AI safety into the public eye. AI safety is a…

Computers and Society · Computer Science 2020-07-10 Mislav Juric, Agneza Sandic, Mario Brcic

While much research in artificial intelligence (AI) has focused on scaling capabilities, the accelerating pace of development makes countervailing work on producing harmless, "aligned" systems increasingly urgent. Yet research on alignment…

Artificial Intelligence · Computer Science 2025-12-12 Dani Roytburg, Beck Miller

Artificial intelligence (AI) is interacting with people at an unprecedented scale, offering new avenues for immense positive impact, but also raising widespread concerns around the potential for individual and societal harm. Today, the…

Artificial Intelligence · Computer Science 2024-06-25 Andrea Bajcsy, Jaime F. Fisac

As artificial intelligence (AI) becomes deeply integrated into critical infrastructures and everyday life, ensuring its safe deployment is one of humanity's most urgent challenges. Current AI models prioritize task optimization over safety,…

Artificial Intelligence · Computer Science 2024-11-08 Joshua T. S. Hewson

Artificial Intelligence (AI) is an effective science which employs strong enough approaches, methods, and techniques to solve unsolvable real world based problems. Because of its unstoppable rise towards the future, there are also some…

Artificial Intelligence · Computer Science 2017-06-12 Alice Pavaloiu, Utku Kose

The issues of AI risk and AI safety are becoming critical as the prospect of artificial general intelligence (AGI) looms larger. The emergence of extremely large and capable generative models has led to alarming predictions and created a…

Artificial Intelligence · Computer Science 2025-05-20 Ali A. Minai

This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive…

Computers and Society · Computer Science 2020-10-07 Iason Gabriel

This paper addresses the question of how to align AI systems with human values and situates it within a wider body of thought regarding technology and value. Far from existing in a vacuum, there has long been an interest in the ability of…

Computers and Society · Computer Science 2021-01-19 Iason Gabriel, Vafa Ghazavi

As AI technologies increase in capability and ubiquity, AI accidents are becoming more common. Based on normal accident theory, high reliability theory, and open systems theory, we create a framework for understanding the risks associated…

Computers and Society · Computer Science 2024-03-13 Heather M. Williams, Roman V. Yampolskiy

Recent discussions and research in AI safety have increasingly emphasized the deep connection between AI safety and existential risk from advanced AI systems, suggesting that work on AI safety necessarily entails serious consideration of…

Computers and Society · Computer Science 2025-02-17 Balint Gyevnar, Atoosa Kasirzadeh

Increasing interest in ensuring the safety of next-generation Artificial Intelligence (AI) systems calls for novel approaches to embedding morality into autonomous agents. This goal differs qualitatively from traditional task-specific AI…

Artificial Intelligence · Computer Science 2025-01-17 Elizaveta Tennant, Stephen Hailes, Mirco Musolesi

Artificial Intelligence (AI) has rapidly evolved over the past decade and has advanced in areas such as language comprehension, image and video recognition, programming, and scientific reasoning. Recent AI technologies based on large…

Machine Learning · Computer Science 2024-10-30 Jonghong Jeon

While artificial intelligence (AI) is advancing rapidly and mastering increasingly complex problems with astonishing performance, the safety assurance of such systems is a major concern. Particularly in the context of safety-critical,…

Artificial Intelligence · Computer Science 2025-07-01 Lars Ullrich, Walter Zimmer, Ross Greer, Knut Graichen, Alois C. Knoll, Mohan Trivedi

Embodied AI systems, comprising AI models and physical plants, are increasingly prevalent across various applications. Due to the rarity of system failures, ensuring their safety in complex operating environments remains a major challenge,…

AI safety is still largely framed as alignment: training models to follow human preferences, safety policies, and normative constraints. That framing has improved the behavior of modern language models, but aligned behavior does not by…

Artificial Intelligence · Computer Science 2026-05-27 Yige Li, Yunhao Feng, Jun Sun

As AI systems become more advanced, companies and regulators will make difficult decisions about whether it is safe to train and deploy them. To prepare for these decisions, we investigate how developers could make a 'safety case,' which is…

Computers and Society · Computer Science 2024-03-20 Joshua Clymer, Nick Gabrieli, David Krueger, Thomas Larsen

We present our Balanced, Integrated and Grounded (BIG) argument for assuring the safety of AI systems. The BIG argument adopts a whole-system approach to constructing a safety case for AI systems of varying capability, autonomy and…

Computers and Society · Computer Science 2025-04-01 Ibrahim Habli, Richard Hawkins, Colin Paterson, Philippa Ryan, Yan Jia, Mark Sujan, John McDermid