Related papers: Remote Timing Attacks on Efficient Language Model …

Trading Inference-Time Compute for Adversarial Robustness

We conduct experiments on the impact of increasing inference-time compute in reasoning models (specifically OpenAI o1-preview and o1-mini) on their robustness to adversarial attacks. We find that across a variety of attacks, increased…

Machine Learning · Computer Science 2025-02-03 Wojciech Zaremba, Evgenia Nitishinskaya, Boaz Barak, Stephanie Lin, Sam Toyer, Yaodong Yu, Rachel Dias, Eric Wallace, Kai Xiao, Johannes Heidecke, Amelia Glaese

Adversarial Vulnerabilities in Large Language Models for Time Series Forecasting

Large Language Models (LLMs) have recently demonstrated significant potential in time series forecasting, offering impressive capabilities in handling complex temporal data. However, their robustness and reliability in real-world…

Machine Learning · Computer Science 2025-03-14 Fuqiang Liu, Sicong Jiang, Luis Miranda-Moreno, Seongjin Choi, Lijun Sun

The Use of Large Language Models (LLM) for Cyber Threat Intelligence (CTI) in Cybercrime Forums

Large language models (LLMs) can be used to analyze cyber threat intelligence (CTI) data from cybercrime forums, which contain extensive information and key discussions about emerging cyber threats. However, to date, the level of accuracy…

Cryptography and Security · Computer Science 2024-10-02 Vanessa Clairoux-Trepanier, Isa-May Beauchamp, Estelle Ruellan, Masarah Paquet-Clouston, Serge-Olivier Paquette, Eric Clay

Auditing Prompt Caching in Language Model APIs

Prompt caching in large language models (LLMs) results in data-dependent timing variations: cached prompts are processed faster than non-cached prompts. These timing differences introduce the risk of side-channel timing attacks. For…

Computation and Language · Computer Science 2025-07-15 Chenchen Gu, Xiang Lisa Li, Rohith Kuditipudi, Percy Liang, Tatsunori Hashimoto

Scaling Trends in Language Model Robustness

Increasing model size has unlocked a dazzling array of capabilities in modern language models. At the same time, even frontier models remain vulnerable to jailbreaks and prompt injections, despite concerted efforts to make them robust. As…

Machine Learning · Computer Science 2025-06-06 Nikolaus Howe, Ian McKenzie, Oskar Hollinsworth, Michał Zajac, Tom Tseng, Aaron Tucker, Pierre-Luc Bacon, Adam Gleave

Robustifying Language Models with Test-Time Adaptation

Large-scale language models achieved state-of-the-art performance over a number of language tasks. However, they fail on adversarial language examples, which are sentences optimized to fool the language models but with similar semantic…

Computation and Language · Computer Science 2023-10-31 Noah Thomas McDermott, Junfeng Yang, Chengzhi Mao

Time Will Tell: Timing Side Channels via Output Token Count in Large Language Models

This paper demonstrates a new side-channel that enables an adversary to extract sensitive information about inference inputs in large language models (LLMs) based on the number of output tokens in the LLM response. We construct attacks…

Machine Learning · Computer Science 2024-12-23 Tianchen Zhang, Gururaj Saileshwar, David Lie

Whisper Leak: a side-channel attack on Large Language Models

Large Language Models (LLMs) are increasingly deployed in sensitive domains including healthcare, legal services, and confidential communications, where privacy is paramount. This paper introduces Whisper Leak, a side-channel attack that…

Cryptography and Security · Computer Science 2025-11-06 Geoff McDonald, Jonathan Bar Or

Misusing Tools in Large Language Models With Visual Adversarial Examples

Large Language Models (LLMs) are being enhanced with the ability to use tools and to process multiple modalities. These new capabilities bring new benefits and also new security risks. In this work, we show that an attacker can use visual…

Cryptography and Security · Computer Science 2023-10-06 Xiaohan Fu, Zihan Wang, Shuheng Li, Rajesh K. Gupta, Niloofar Mireshghallah, Taylor Berg-Kirkpatrick, Earlence Fernandes

Defending Large Language Models Against Attacks With Residual Stream Activation Analysis

The widespread adoption of Large Language Models (LLMs), exemplified by OpenAI's ChatGPT, brings to the forefront the imperative to defend against adversarial threats on these models. These attacks, which manipulate an LLM's output by…

Cryptography and Security · Computer Science 2025-04-04 Amelia Kawasaki, Andrew Davis, Houssam Abbas

InputSnatch: Stealing Input in LLM Services via Timing Side-Channel Attacks

Large language models (LLMs) possess extensive knowledge and question-answering capabilities, having been widely deployed in privacy-sensitive domains like finance and medical consultation. During LLM inferences, cache-sharing methods are…

Cryptography and Security · Computer Science 2024-12-02 Xinyao Zheng, Husheng Han, Shangyi Shi, Qiyan Fang, Zidong Du, Xing Hu, Qi Guo

When Human cognitive modeling meets PINs: User-independent inter-keystroke timing attacks

This paper proposes the first user-independent inter-keystroke timing attacks on PINs. Our attack method is based on an inter-keystroke timing dictionary built from a human cognitive model whose parameters can be determined by a small…

Cryptography and Security · Computer Science 2018-10-18 Ximing Liu, Yingjiu Li, Robert H. Deng, Bing Chang, Shujun Li

Backdoor Attacks for In-Context Learning with Language Models

Because state-of-the-art language models are expensive to train, most practitioners must make use of one of the few publicly available language models or language model APIs. This consolidation of trust increases the potency of backdoor…

Cryptography and Security · Computer Science 2023-07-28 Nikhil Kandpal, Matthew Jagielski, Florian Tramèr, Nicholas Carlini

UltraFeedback: Boosting Language Models with Scaled AI Feedback

Learning from human feedback has become a pivot technique in aligning large language models (LLMs) with human preferences. However, acquiring vast and premium human feedback is bottlenecked by time, labor, and human capability, resulting in…

Computation and Language · Computer Science 2024-07-17 Ganqu Cui, Lifan Yuan, Ning Ding, Guanming Yao, Bingxiang He, Wei Zhu, Yuan Ni, Guotong Xie, Ruobing Xie, Yankai Lin, Zhiyuan Liu, Maosong Sun

Extracting Training Data from Large Language Models

It has become common to publish large (billion parameter) language models that have been trained on private datasets. This paper demonstrates that in such settings, an adversary can perform a training data extraction attack to recover…

Cryptography and Security · Computer Science 2021-06-16 Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, Colin Raffel

Inferring Events from Time Series using Language Models

Time series data measure how environments change over time and drive decision-making in critical domains like finance and healthcare. A common goal in analyzing time series data is to understand the underlying events that cause the observed…

Artificial Intelligence · Computer Science 2025-05-26 Mingtian Tan, Mike A. Merrill, Zack Gottesman, Tim Althoff, David Evans, Tom Hartvigsen

Integrating Large Language Models with Internet of Things Applications

This paper identifies and analyzes applications in which Large Language Models (LLMs) can make Internet of Things (IoT) networks more intelligent and responsive through three case studies from critical topics: DDoS attack detection,…

Artificial Intelligence · Computer Science 2024-10-28 Mingyu Zong, Arvin Hekmati, Michael Guastalla, Yiyi Li, Bhaskar Krishnamachari

Sampling-aware Adversarial Attacks Against Large Language Models

To guarantee safe and robust deployment of large language models (LLMs) at scale, it is critical to accurately assess their adversarial robustness. Existing adversarial attacks typically target harmful responses in single-point greedy…

Machine Learning · Computer Science 2026-02-24 Tim Beyer, Yan Scholten, Leo Schwinn, Stephan Günnemann

LLMs Have Rhythm: Fingerprinting Large Language Models Using Inter-Token Times and Network Traffic Analysis

As Large Language Models (LLMs) become increasingly integrated into many technological ecosystems across various domains and industries, identifying which model is deployed or being interacted with is critical for the security and…

Cryptography and Security · Computer Science 2025-07-09 Saeif Alhazbi, Ahmed Mohamed Hussain, Gabriele Oligeri, Panos Papadimitratos

Temporal Attack Pattern Detection in Multi-Agent AI Workflows: An Open Framework for Training Trace-Based Security Models

We present an openly documented methodology for fine-tuning language models to detect temporal attack patterns in multi-agent AI workflows using OpenTelemetry trace analysis. We curate a dataset of 80,851 examples from 18 public…

Artificial Intelligence · Computer Science 2026-01-06 Ron F. Del Rosario