Related papers: Function Vectors in Large Language Models

Causality $\neq$ Invariance: Function and Concept Vectors in LLMs

Do large language models (LLMs) represent concepts abstractly, i.e., independent of input format? We revisit Function Vectors (FVs), compact representations of in-context learning (ICL) tasks that causally drive task performance. Across…

Computation and Language · Computer Science 2026-02-27 Gustaw Opiełka, Hannes Rosenbusch, Claire E. Stevenson

How Few-Shot Examples Add Up: A Causal Decomposition of Function Vectors in In-Context Learning

In-context learning (ICL) excels at new tasks from minimal examples, yet we still lack a mechanistic explanation of how few-shot prompts shape a model's function vector (FV)--a causal activation direction that drives task behavior on the…

Machine Learning · Computer Science 2026-05-26 Entang Wang, Yiwei Wang, Aleksandra Bakalova, Michael Hahn

Do different prompting methods yield a common task representation in language models?

Demonstrations and instructions are two primary approaches for prompting language models to perform in-context learning (ICL) tasks. Do identical tasks elicited in different ways result in similar representations of the task? An improved…

Computation and Language · Computer Science 2025-12-02 Guy Davidson, Todd M. Gureckis, Brenden M. Lake, Adina Williams

Analogical Reasoning Inside Large Language Models: Concept Vectors and the Limits of Abstraction

Analogical reasoning relies on conceptual abstractions, but it is unclear whether Large Language Models (LLMs) harbor such internal representations. We explore distilled representations from LLM activations and find that function vectors…

Computation and Language · Computer Science 2025-03-06 Gustaw Opiełka, Hannes Rosenbusch, Claire E. Stevenson

Exploring Language-Agnosticity in Function Vectors: A Case Study in Machine Translation

Function vectors (FVs) are vector representations of tasks extracted from model activations during in-context learning. While prior work has shown that multilingual model representations can be language-agnostic, it remains unclear whether…

Computation and Language · Computer Science 2026-04-22 Nurkhan Laiyk, Gerard I. Gállego, Javier Ferrando, Fajri Koto

Label Words as Local Task Vectors in In-Context Learning

Large Language Models (LLMs) have demonstrated remarkable abilities, one of the most important being in-context learning (ICL). With ICL, LLMs can derive the underlying rule from a few demonstrations and provide answers that comply with the…

Computation and Language · Computer Science 2025-12-23 Bowen Zheng, Ming Ma, Zhongqiao Lin, Tianming Yang

In-Context Learning Creates Task Vectors

In-context learning (ICL) in Large Language Models (LLMs) has emerged as a powerful new learning paradigm. However, its underlying mechanism is still not well understood. In particular, it is challenging to map it to the "standard" machine…

Computation and Language · Computer Science 2023-10-25 Roee Hendel, Mor Geva, Amir Globerson

Task Vectors, Learned Not Extracted: Performance Gains and Mechanistic Insight

Large Language Models (LLMs) can perform new tasks from in-context demonstrations, a phenomenon known as in-context learning (ICL). Recent work suggests that these demonstrations are compressed into task vectors (TVs), compact task…

Computation and Language · Computer Science 2026-05-04 Haolin Yang, Hakaze Cho, Kaize Ding, Naoya Inoue

Learning Task Representations from In-Context Learning

Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning (ICL), where models adapt to new tasks through example-based prompts without requiring parameter updates. However, understanding how tasks are…

Computation and Language · Computer Science 2025-11-11 Baturay Saglam, Xinyang Hu, Zhuoran Yang, Dionysis Kalogerias, Amin Karbasi

Which Attention Heads Matter for In-Context Learning?

Large language models (LLMs) exhibit impressive in-context learning (ICL) capability, enabling them to perform new tasks using only a few demonstrations in the prompt. Two different mechanisms have been proposed to explain ICL: induction…

Machine Learning · Computer Science 2025-05-05 Kayo Yin, Jacob Steinhardt

Understanding Task Vectors in In-Context Learning: Emergence, Functionality, and Limitations

Task vectors offer a compelling mechanism for accelerating inference in in-context learning (ICL) by distilling task-specific information into a single, reusable representation. Despite their empirical success, the underlying principles…

Machine Learning · Computer Science 2025-06-11 Yuxin Dong, Jiachen Jiang, Zhihui Zhu, Xia Ning

Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning

Catastrophic forgetting (CF) poses a significant challenge in machine learning, where a model forgets previously learned information upon learning new tasks. Despite the advanced capabilities of Large Language Models (LLMs), they continue…

Machine Learning · Computer Science 2025-04-17 Gangwei Jiang, Caigao Jiang, Zhaoyi Li, Siqiao Xue, Jun Zhou, Linqi Song, Defu Lian, Ying Wei

Vision-Language Models Create Cross-Modal Task Representations

Autoregressive vision-language models (VLMs) can handle many tasks within a single model, yet the representations that enable this capability remain opaque. We find that VLMs align conceptually equivalent inputs into a shared task vector,…

Computer Vision and Pattern Recognition · Computer Science 2025-05-08 Grace Luo, Trevor Darrell, Amir Bar

Relational Knowledge Distillation Using Fine-tuned Function Vectors

Representing relations between concepts is a core prerequisite for intelligent systems to make sense of the world. Recent work using causal mediation analysis has shown that a small set of attention heads encodes task representation in…

Computation and Language · Computer Science 2026-01-14 Andrea Kang, Yingnian Wu, Hongjing Lu

Multimodal Function Vectors for Spatial Relations

Large Multimodal Models (LMMs) demonstrate impressive in-context learning abilities from limited multimodal demonstrations, yet the internal mechanisms supporting such task learning remain opaque. Building on prior work of large language…

Artificial Intelligence · Computer Science 2025-10-06 Shuhao Fu, Esther Goldberg, Ying Nian Wu, Hongjing Lu

In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering

Large language models (LLMs) demonstrate emergent in-context learning capabilities, where they adapt to new tasks based on example demonstrations. However, in-context learning has seen limited effectiveness in many settings, is difficult to…

Machine Learning · Computer Science 2024-02-15 Sheng Liu, Haotian Ye, Lei Xing, James Zou

Concept Attractors in LLMs and their Applications

Large language models (LLMs) often map semantically related prompts to similar internal representations at specific layers, even when their surface forms differ widely. We show that this behavior can be explained through Iterated Function…

Computation and Language · Computer Science 2026-01-21 Sotirios Panagiotis Chytas, Vikas Singh

Functional Abstraction of Knowledge Recall in Large Language Models

Pre-trained transformer large language models (LLMs) demonstrate strong knowledge recall capabilities. This paper investigates the knowledge recall mechanism in LLMs by abstracting it into a functional structure. We propose that during…

Computation and Language · Computer Science 2025-04-22 Zijian Wang, Chang Xu

Functional Subspace, where language models can use vector algebra to solve problems

Large language models (LLMs) were invented for natural language tasks such as translation, but they have proved that they can perform highly complex functions across domains. Additionally, they have been thought to develop new skills…

Computation and Language · Computer Science 2026-05-12 Jung H. Lee, Sujith Vijayan

Provable In-Context Vector Arithmetic via Retrieving Task Concepts

In-context learning (ICL) has garnered significant attention for its ability to grasp functions/tasks from demonstrations. Recent studies suggest the presence of a latent task/function vector in LLMs during ICL. Merullo et al. (2024) showed…

Machine Learning · Computer Science 2025-08-14 Dake Bu, Wei Huang, Andi Han, Atsushi Nitanda, Qingfu Zhang, Hau-San Wong, Taiji Suzuki