Related papers: Function Vectors in Large Language Models

200 papers

Do large language models (LLMs) represent concepts abstractly, i.e., independent of input format? We revisit Function Vectors (FVs), compact representations of in-context learning (ICL) tasks that causally drive task performance. Across…

Computation and Language · Computer Science 2026-02-27 Gustaw Opiełka, Hannes Rosenbusch, Claire E. Stevenson

In-context learning (ICL) excels at new tasks from minimal examples, yet we still lack a mechanistic explanation of how few-shot prompts shape a model's function vector (FV)--a causal activation direction that drives task behavior on the…

Machine Learning · Computer Science 2026-05-26 Entang Wang, Yiwei Wang, Aleksandra Bakalova, Michael Hahn

Demonstrations and instructions are two primary approaches for prompting language models to perform in-context learning (ICL) tasks. Do identical tasks elicited in different ways result in similar representations of the task? An improved…

Computation and Language · Computer Science 2025-12-02 Guy Davidson, Todd M. Gureckis, Brenden M. Lake, Adina Williams

Analogical reasoning relies on conceptual abstractions, but it is unclear whether Large Language Models (LLMs) harbor such internal representations. We explore distilled representations from LLM activations and find that function vectors…

Computation and Language · Computer Science 2025-03-06 Gustaw Opiełka, Hannes Rosenbusch, Claire E. Stevenson

Function vectors (FVs) are vector representations of tasks extracted from model activations during in-context learning. While prior work has shown that multilingual model representations can be language-agnostic, it remains unclear whether…

Computation and Language · Computer Science 2026-04-22 Nurkhan Laiyk, Gerard I. Gállego, Javier Ferrando, Fajri Koto

Large Language Models (LLMs) have demonstrated remarkable abilities, one of the most important being in-context learning (ICL). With ICL, LLMs can derive the underlying rule from a few demonstrations and provide answers that comply with the…

Computation and Language · Computer Science 2025-12-23 Bowen Zheng, Ming Ma, Zhongqiao Lin, Tianming Yang

In-context learning (ICL) in Large Language Models (LLMs) has emerged as a powerful new learning paradigm. However, its underlying mechanism is still not well understood. In particular, it is challenging to map it to the "standard" machine…

Computation and Language · Computer Science 2023-10-25 Roee Hendel, Mor Geva, Amir Globerson

Large Language Models (LLMs) can perform new tasks from in-context demonstrations, a phenomenon known as in-context learning (ICL). Recent work suggests that these demonstrations are compressed into task vectors (TVs), compact task…

Computation and Language · Computer Science 2026-05-04 Haolin Yang, Hakaze Cho, Kaize Ding, Naoya Inoue

Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning (ICL), where models adapt to new tasks through example-based prompts without requiring parameter updates. However, understanding how tasks are…

Computation and Language · Computer Science 2025-11-11 Baturay Saglam, Xinyang Hu, Zhuoran Yang, Dionysis Kalogerias, Amin Karbasi

Large language models (LLMs) exhibit impressive in-context learning (ICL) capability, enabling them to perform new tasks using only a few demonstrations in the prompt. Two different mechanisms have been proposed to explain ICL: induction…

Machine Learning · Computer Science 2025-05-05 Kayo Yin, Jacob Steinhardt

Task vectors offer a compelling mechanism for accelerating inference in in-context learning (ICL) by distilling task-specific information into a single, reusable representation. Despite their empirical success, the underlying principles…

Machine Learning · Computer Science 2025-06-11 Yuxin Dong, Jiachen Jiang, Zhihui Zhu, Xia Ning

Catastrophic forgetting (CF) poses a significant challenge in machine learning, where a model forgets previously learned information upon learning new tasks. Despite the advanced capabilities of Large Language Models (LLMs), they continue…

Machine Learning · Computer Science 2025-04-17 Gangwei Jiang, Caigao Jiang, Zhaoyi Li, Siqiao Xue, Jun Zhou, Linqi Song, Defu Lian, Ying Wei

Autoregressive vision-language models (VLMs) can handle many tasks within a single model, yet the representations that enable this capability remain opaque. We find that VLMs align conceptually equivalent inputs into a shared task vector,…

Computer Vision and Pattern Recognition · Computer Science 2025-05-08 Grace Luo, Trevor Darrell, Amir Bar

Representing relations between concepts is a core prerequisite for intelligent systems to make sense of the world. Recent work using causal mediation analysis has shown that a small set of attention heads encodes task representation in…

Computation and Language · Computer Science 2026-01-14 Andrea Kang, Yingnian Wu, Hongjing Lu

Large Multimodal Models (LMMs) demonstrate impressive in-context learning abilities from limited multimodal demonstrations, yet the internal mechanisms supporting such task learning remain opaque. Building on prior work of large language…

Artificial Intelligence · Computer Science 2025-10-06 Shuhao Fu, Esther Goldberg, Ying Nian Wu, Hongjing Lu

Large language models (LLMs) demonstrate emergent in-context learning capabilities, where they adapt to new tasks based on example demonstrations. However, in-context learning has seen limited effectiveness in many settings, is difficult to…

Machine Learning · Computer Science 2024-02-15 Sheng Liu, Haotian Ye, Lei Xing, James Zou

Large language models (LLMs) often map semantically related prompts to similar internal representations at specific layers, even when their surface forms differ widely. We show that this behavior can be explained through Iterated Function…

Computation and Language · Computer Science 2026-01-21 Sotirios Panagiotis Chytas, Vikas Singh

Pre-trained transformer large language models (LLMs) demonstrate strong knowledge recall capabilities. This paper investigates the knowledge recall mechanism in LLMs by abstracting it into a functional structure. We propose that during…

Computation and Language · Computer Science 2025-04-22 Zijian Wang, Chang Xu

Large language models (LLMs) were invented for natural language tasks such as translation, but they have proved that they can perform highly complex functions across domains. Additionally, they have been thought to develop new skills…

Computation and Language · Computer Science 2026-05-12 Jung H. Lee, Sujith Vijayan

In-context learning (ICL) has garnered significant attention for its ability to grasp functions/tasks from demonstrations. Recent studies suggest the presence of a latent task/function vector in LLMs during ICL. Merullo et al. (2024) showed…

Machine Learning · Computer Science 2025-08-14 Dake Bu, Wei Huang, Andi Han, Atsushi Nitanda, Qingfu Zhang, Hau-San Wong, Taiji Suzuki