Related papers: Bergeron: Combating Adversarial Attacks through a …

200 papers

Recent studies on the safety alignment of large language models (LLMs) have revealed that existing approaches often operate superficially, leaving models vulnerable to various adversarial attacks. Despite their significance, these studies…

Cryptography and Security · Computer Science 2025-06-02 Jianwei Li, Jung-Eun Kim

The widespread adoption of Large Language Models (LLMs) has revolutionized AI deployment, enabling autonomous and semi-autonomous applications across industries through intuitive language interfaces and continuous improvements in model…

Cryptography and Security · Computer Science 2025-10-20 Adam Swanda, Amy Chang, Alexander Chen, Fraser Burch, Paul Kassianik, Konstantin Berlin

As powerful Large Language Models (LLMs) are now widely used for numerous practical applications, their safety is of critical importance. While alignment techniques have significantly improved overall safety, LLMs remain vulnerable to…

Machine Learning · Computer Science 2024-10-28 Samuel Jacob Chacko, Sajib Biswas, Chashi Mahiul Islam, Fatema Tabassum Liza, Xiuwen Liu

Large Language Models (LLMs) are swiftly advancing in architecture and capability, and as they integrate more deeply into complex systems, the urgency to scrutinize their security properties grows. This paper surveys research in the…

Computation and Language · Computer Science 2023-10-18 Erfan Shayegani, Md Abdullah Al Mamun, Yu Fu, Pedram Zaree, Yue Dong, Nael Abu-Ghazaleh

This position paper proposes a novel approach to advancing NLP security by leveraging Large Language Models (LLMs) as engines for generating diverse adversarial attacks. Building upon recent work demonstrating LLMs' effectiveness in…

Artificial Intelligence · Computer Science 2024-10-25 Sudarshan Srinivasan, Maria Mahbub, Amir Sadovnik

Large Language Models (LLMs) have transformed artificial intelligence by advancing natural language understanding and generation, enabling applications across fields beyond healthcare, software engineering, and conversational systems.…

Large Language Models (LLMs) have become a cornerstone in the field of Natural Language Processing (NLP), offering transformative capabilities in understanding and generating human-like text. However, with their rising prominence, the…

Cryptography and Security · Computer Science 2024-03-26 Arijit Ghosh Chowdhury, Md Mofijul Islam, Vaibhav Kumar, Faysal Hossain Shezan, Vaibhav Kumar, Vinija Jain, Aman Chadha

Robust verbal confidence generated by large language models (LLMs) is crucial for the deployment of LLMs to help ensure transparency, trust, and safety in many applications, including those involving human-AI interactions. In this paper, we…

Computation and Language · Computer Science 2025-12-19 Stephen Obadinma, Xiaodan Zhu

While vision and multimodal foundation models underpin critical tasks from perception to complex reasoning, they remain highly vulnerable to adversarial attacks. However, traditional adversarial attacks are typically limited to single,…

Cryptography and Security · Computer Science 2026-05-20 Ye Sun, Xin Wang, Jiaming Zhang, Yifeng Gao, Yixu Wang, Yifan Ding, Qixian Zhang, Henghui Ding, Xingjun Ma, Yu-Gang Jiang

Recently, Large Language Models (LLMs) have made significant advancements and are now widely used across various domains. Unfortunately, there has been a rising concern that LLMs can be misused to generate harmful or malicious content.…

Computation and Language · Computer Science 2024-06-13 Bochuan Cao, Yuanpu Cao, Lu Lin, Jinghui Chen

Large Language Models (LLMs) have demonstrated impressive capabilities in natural language tasks, but their safety and morality remain contentious due to their training on internet text corpora. To address these concerns, alignment…

Computation and Language · Computer Science 2024-08-06 Mohammad Bahrami Karkevandi, Nishant Vishwamitra, Peyman Najafirad

Large Language Models (LLMs) have become central to numerous natural language processing tasks, but their vulnerabilities present significant security and ethical challenges. This systematic survey explores the evolving landscape of attack…

Cryptography and Security · Computer Science 2025-05-05 Zhiyu Liao, Kang Chen, Yuanguo Lin, Kangkang Li, Yunxuan Liu, Hefeng Chen, Xingwang Huang, Yuanhui Yu

With the wide application of large language models (LLMs), the problems of bias and value inconsistency in sensitive domains have gradually emerged, especially in terms of race, society and politics. In this paper, we propose an adversarial…

Computation and Language · Computer Science 2026-01-23 Yuan Gao, Zhigang Liu, Xinyu Yao, Bo Chen, Xiaobing Zhao

Over the past decade, there has been extensive research aimed at enhancing the robustness of neural networks, yet this problem remains vastly unsolved. Here, one major impediment has been the overestimation of the robustness of new defense…

Artificial Intelligence · Computer Science 2023-10-31 Leo Schwinn, David Dobre, Stephan Günnemann, Gauthier Gidel

The progress of AI systems such as large language models (LLMs) raises increasingly pressing concerns about their safe deployment. This paper examines the value alignment problem for LLMs, arguing that current alignment strategies are…

Computation and Language · Computer Science 2025-06-06 Raphaël Millière

Large Language Models (LLMs) have revolutionized artificial intelligence and machine learning through their advanced text processing and generating capabilities. However, their widespread deployment has raised significant safety and…

Cryptography and Security · Computer Science 2024-12-03 Jing Cui, Yishi Xu, Zhewei Huang, Shuchang Zhou, Jianbin Jiao, Junge Zhang

Warning: This paper contains examples of harmful language, and reader discretion is recommended. The increasing open release of powerful large language models (LLMs) has facilitated the development of downstream applications by reducing the…

Computation and Language · Computer Science 2023-10-05 Xianjun Yang, Xiao Wang, Qi Zhang, Linda Petzold, William Yang Wang, Xun Zhao, Dahua Lin

This paper presents an approach to developing assurance cases for adversarial robustness and regulatory compliance in large language models (LLMs). Focusing on both natural and code language tasks, we explore the vulnerabilities these…

Cryptography and Security · Computer Science 2024-10-10 Tomas Bueno Momcilovic, Dian Balta, Beat Buesser, Giulio Zizzo, Mark Purcell

Large Language Models are fundamental actors in the modern IT landscape dominated by AI solutions. However, security threats associated with them might prevent their reliable adoption in critical application scenarios such as government…

Cryptography and Security · Computer Science 2025-11-10 Marco Arazzi, Vignesh Kumar Kembu, Antonino Nocera, Vinod P

Large language models (LLMs) are vulnerable when trained on datasets containing harmful content, which leads to potential jailbreaking attacks in two scenarios: the integration of harmful texts within crowdsourced data used for pre-training…

Cryptography and Security · Computer Science 2024-06-03 Xiaoqun Liu, Jiacheng Liang, Muchao Ye, Zhaohan Xi