Related papers: Language Model Inversion

200 papers

Language model inversion seeks to recover hidden prompts using only language model outputs. This capability has implications for security and accountability in language model deployments, such as leaking private information from an…

Computation and Language · Computer Science 2025-12-12 Murtaza Nazir, Matthew Finlayson, John X. Morris, Xiang Ren, Swabha Swayamdipta

We explore a new language model inversion problem under strict black-box, zero-shot, and limited data conditions. We propose a novel training-free framework that reconstructs prompts using only a limited number of text outputs from a…

Computation and Language · Computer Science 2025-02-18 Hanqing Li, Diego Klabjan

Large language models (LLMs) have achieved remarkable progress across diverse tasks, yet their internal mechanisms remain largely opaque. In this work, we investigate a fundamental question: to what extent can the original input text be…

Computation and Language · Computer Science 2026-05-11 Haiyan Zhao, Zirui He, Yiming Tang, Fan Yang, Ali Payani, Dianbo Liu, Mengnan Du

Despite emerging research on Language Models (LMs), few approaches analyse the invertibility of LMs. That is, given an LM and a desirable target output sequence of tokens, determining what input prompts would yield the target output remains…

Computation and Language · Computer Science 2026-02-12 Kevin Yandoka Denamganaï, Kartic Subr

The generative nature of Large Language Models (LLMs) is reflected in the conditional probabilities they compute to sample each response token given the previous tokens. These probabilities encode the distributional structure that the model…

Computation and Language · Computer Science 2026-05-22 Shilpika Shilpika, Carlo Graziani, Bethany Lusch, Venkatram Vishwanath, Michael E. Papka

We consider the problem of language model inversion: given outputs of a language model, we seek to extract the prompt that generated these outputs. We develop a new black-box method, output2prompt, that learns to extract prompts without…

Computation and Language · Computer Science 2024-10-10 Collin Zhang, John X. Morris, Vitaly Shmatikov

In-context learning is a recent paradigm in natural language understanding, where a large pre-trained language model (LM) observes a test instance and a few training examples as its input, and directly decodes the output without any update…

Computation and Language · Computer Science 2022-05-10 Ohad Rubin, Jonathan Herzig, Jonathan Berant

Language model inversion (LMI), i.e., recovering hidden prompts from outputs, emerges as a concrete threat to user privacy and system security. We recast LMI as reusing the LLM's own latent space and propose the Invariant Latent Space…

Machine Learning · Computer Science 2025-11-26 Wentao Ye, Jiaqi Hu, Haobo Wang, Xinpeng Ti, Zhiqing Xiao, Hao Chen, Liyao Li, Lei Feng, Sai Wu, Junbo Zhao

Autoregressive neural language models (LMs) generate a probability distribution over tokens at each time step given a prompt. In this work, we attempt to systematically understand the probability distributions that LMs can produce, showing…

Computation and Language · Computer Science 2025-09-23 Haojin Wang, Zining Zhu, Freda Shi

In this work, we show that pre-trained language models return distinguishable generation probability and uncertainty distributions for unfaithfully hallucinated texts, regardless of their size and structure. By examining 24 models on 6 data…

Computation and Language · Computer Science 2024-09-26 Taehun Cha, Donghun Lee

In this work, we observe an interesting phenomenon: it is possible to generate reversible sentence embeddings that allow an LLM to reconstruct the original text exactly, without modifying the model's weights. This is achieved by introducing…

Computation and Language · Computer Science 2026-01-09 Ignacio Sastre, Aiala Rosá

Large Language Models (LLMs) have shown promise in clinical applications through prompt engineering, allowing flexible clinical predictions. However, they struggle to produce reliable prediction probabilities, which are crucial for…

Artificial Intelligence · Computer Science 2024-12-05 Bowen Gu, Rishi J. Desai, Kueiyu Joshua Lin, Jie Yang

In recent years, Large Language Models (LLMs) have emerged as pivotal tools in various applications. However, these models are susceptible to adversarial prompt attacks, where attackers can carefully curate input strings that mislead LLMs…

Computation and Language · Computer Science 2024-02-20 Zhengmian Hu, Gang Wu, Saayan Mitra, Ruiyi Zhang, Tong Sun, Heng Huang, Viswanathan Swaminathan

This article presents a general and flexible method for prompting a large language model (LLM) to reveal its (hidden) token input embedding up to homeomorphism. Moreover, this article provides strong theoretical justification -- a…

Differential Geometry · Mathematics 2025-03-20 Michael Robinson, Sourya Dey, Taisa Kushner

Estimating uncertainty in Large Language Models (LLMs) is important for properly evaluating LLMs, and ensuring safety for users. However, prior approaches to uncertainty estimation focus on the final answer in generated text, ignoring…

Computation and Language · Computer Science 2024-12-12 Eric Bigelow, Ari Holtzman, Hidenori Tanaka, Tomer Ullman

Language modeling has shifted in recent years from a distribution over strings to prediction models with textual inputs and outputs for general-purpose tasks. This position paper highlights the often overlooked implications of this shift…

Computation and Language · Computer Science 2026-05-13 Eitan Wagner, Omri Abend

Humans are accustomed to reading and writing in a forward manner, and this natural bias extends to text understanding in auto-regressive large language models (LLMs). This paper investigates whether LLMs, like humans, struggle with reverse…

Computation and Language · Computer Science 2025-02-25 Sicheng Yu, Yuanchen Xu, Cunxiao Du, Yanying Zhou, Minghui Qiu, Qianru Sun, Hao Zhang, Jiawei Wu

Prompt recovery, a crucial task in natural language processing, entails the reconstruction of prompts or instructions that language models use to convert input text into a specific output. Although pivotal, the design and effectiveness of…

Computation and Language · Computer Science 2024-07-09 Jianlong Chen, Wei Xu, Zhicheng Ding, Jinxin Xu, Hao Yan, Xinyu Zhang

Can autoregressive large language models (LLMs) learn consistent probability distributions when trained on sequences in different token orders? We prove formally that for any well-defined probability distribution, sequence perplexity is…

Computation and Language · Computer Science 2025-05-14 Xiaoliang Luo, Xinyi Xu, Michael Ramscar, Bradley C. Love

Large language models (LLMs) have been applied in various applications due to their astonishing capabilities. With advancements in technologies such as chain-of-thought (CoT) prompting and in-context learning (ICL), the prompts fed to LLMs…

Computation and Language · Computer Science 2023-12-07 Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang, Lili Qiu