Related papers: Learning Task Representations from In-Context Lear…

In-Context Learning Creates Task Vectors

In-context learning (ICL) in Large Language Models (LLMs) has emerged as a powerful new learning paradigm. However, its underlying mechanism is still not well understood. In particular, it is challenging to map it to the "standard" machine…

Computation and Language · Computer Science 2023-10-25 Roee Hendel, Mor Geva, Amir Globerson

Re-examining learning linear functions in context

In-context learning (ICL) has emerged as a powerful paradigm for easily adapting Large Language Models (LLMs) to various tasks. However, our understanding of how ICL works remains limited. We explore a simple model of ICL in a controlled…

Machine Learning · Computer Science 2025-09-03 Omar Naim, Guilhem Fouilhé, Nicholas Asher

Label Words as Local Task Vectors in In-Context Learning

Large Language Models (LLMs) have demonstrated remarkable abilities, one of the most important being in-context learning (ICL). With ICL, LLMs can derive the underlying rule from a few demonstrations and provide answers that comply with the…

Computation and Language · Computer Science 2025-12-23 Bowen Zheng, Ming Ma, Zhongqiao Lin, Tianming Yang

Emergence and Effectiveness of Task Vectors in In-Context Learning: An Encoder Decoder Perspective

Autoregressive transformers exhibit adaptive learning through in-context learning (ICL), which begs the question of how. Prior works have shown that transformers represent the ICL tasks as vectors in their representations. In this paper, we…

Computation and Language · Computer Science 2025-06-03 Seungwook Han, Jinyeop Song, Jeff Gore, Pulkit Agrawal

In-Context Learning with Representations: Contextual Generalization of Trained Transformers

In-context learning (ICL) refers to a remarkable capability of pretrained large language models, which can learn a new task given a few examples during inference. However, theoretical understanding of ICL is largely under-explored,…

Machine Learning · Computer Science 2024-09-27 Tong Yang, Yu Huang, Yingbin Liang, Yuejie Chi

Localizing Task Recognition and Task Learning in In-Context Learning via Attention Head Analysis

We investigate the mechanistic underpinnings of in-context learning (ICL) in large language models by reconciling two dominant perspectives: the component-level analysis of attention heads and the holistic decomposition of ICL into Task…

Computation and Language · Computer Science 2026-05-04 Haolin Yang, Hakaze Cho, Naoya Inoue

Understanding Generalization and Forgetting in In-Context Continual Learning

In-context learning (ICL) derives its power from enabling Large Language Models to adapt to new tasks via prompt-based reasoning alone, entirely bypassing the need for parameter updates. Existing theories primarily study ICL in single-task…

Machine Learning · Computer Science 2026-05-28 Guangyu Li, Meng Ding, Lijie Hu

Task Vectors, Learned Not Extracted: Performance Gains and Mechanistic Insight

Large Language Models (LLMs) can perform new tasks from in-context demonstrations, a phenomenon known as in-context learning (ICL). Recent work suggests that these demonstrations are compressed into task vectors (TVs), compact task…

Computation and Language · Computer Science 2026-05-04 Haolin Yang, Hakaze Cho, Kaize Ding, Naoya Inoue

A Mechanism for Sample-Efficient In-Context Learning for Sparse Retrieval Tasks

We study the phenomenon of in-context learning (ICL) exhibited by large language models, where they can adapt to a new learning task, given a handful of labeled examples, without any explicit parameter optimization. Our goal is to…

Machine Learning · Computer Science 2023-05-29 Jacob Abernethy, Alekh Agarwal, Teodor V. Marinov, Manfred K. Warmuth

Large Language Models Know What Makes Exemplary Contexts

In-context learning (ICL) has proven to be a significant capability with the advancement of Large Language models (LLMs). By instructing LLMs using few-shot demonstrative examples, ICL enables them to perform a wide range of tasks without…

Computation and Language · Computer Science 2024-08-21 Quanyu Long, Jianda Chen, Wenya Wang, Sinno Jialin Pan

What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning

Large language models (LLMs) exploit in-context learning (ICL) to solve tasks with only a few demonstrations, but its mechanisms are not yet well-understood. Some works suggest that LLMs only recall already learned concepts from…

Computation and Language · Computer Science 2023-05-18 Jane Pan, Tianyu Gao, Howard Chen, Danqi Chen

From Compression to Expression: A Layerwise Analysis of In-Context Learning

In-context learning (ICL) enables large language models (LLMs) to adapt to new tasks without weight updates by learning from demonstration sequences. While ICL shows strong empirical performance, its internal representational mechanisms are…

Computation and Language · Computer Science 2025-10-07 Jiachen Jiang, Yuxin Dong, Jinxin Zhou, Zhihui Zhu

Learning Linear Regression with Low-Rank Tasks in-Context

In-context learning (ICL) is a key building block of modern large language models, yet its theoretical mechanisms remain poorly understood. It is particularly mysterious how ICL operates in real-world applications where tasks have a common…

Disordered Systems and Neural Networks · Physics 2026-04-24 Kaito Takanami, Takashi Takahashi, Yoshiyuki Kabashima

Provable In-Context Vector Arithmetic via Retrieving Task Concepts

In-context learning (ICL) has garnered significant attention for its ability to grasp functions/tasks from demonstrations. Recent studies suggest the presence of a latent task/function vector in LLMs during ICL. Merullo et al. (2024) showed…

Machine Learning · Computer Science 2025-08-14 Dake Bu, Wei Huang, Andi Han, Atsushi Nitanda, Qingfu Zhang, Hau-San Wong, Taiji Suzuki

How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes

Large language models (LLM) have recently shown the extraordinary ability to perform unseen tasks based on few-shot examples provided as text, also known as in-context learning (ICL). While recent works have attempted to understand the…

Computation and Language · Computer Science 2024-04-05 Harmon Bhasin, Timothy Ossowski, Yiqiao Zhong, Junjie Hu

Unlocking In-Context Learning for Natural Datasets Beyond Language Modelling

Large Language Models (LLMs) exhibit In-Context Learning (ICL), which enables the model to perform new tasks conditioning only on the examples provided in the context without updating the model's weights. While ICL offers fast adaptation…

Computation and Language · Computer Science 2025-10-07 Jelena Bratulić, Sudhanshu Mittal, David T. Hoffmann, Samuel Böhm, Robin Tibor Schirrmeister, Tonio Ball, Christian Rupprecht, Thomas Brox

Beyond In-Context Learning: Aligning Long-form Generation of Large Language Models via Task-Inherent Attribute Guidelines

In-context learning (ICL) is an important yet not fully understood ability of pre-trained large language models (LLMs). It can greatly enhance task performance using a few examples, termed demonstrations, without fine-tuning. Although…

Computation and Language · Computer Science 2025-06-03 Do Xuan Long, Duong Ngoc Yen, Do Xuan Trong, Luu Anh Tuan, Kenji Kawaguchi, Shafiq Joty, Min-Yen Kan, Nancy F. Chen

Not All Demonstration Examples are Equally Beneficial: Reweighting Demonstration Examples for In-Context Learning

Large Language Models (LLMs) have recently gained the In-Context Learning (ICL) ability with the models scaling up, allowing them to quickly adapt to downstream tasks with only a few demonstration examples prepended in the input sequence.…

Computation and Language · Computer Science 2024-03-19 Zhe Yang, Damai Dai, Peiyi Wang, Zhifang Sui

In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax

In-context learning (ICL) is now a common method for teaching large language models (LLMs) new tasks: given labeled examples in the input context, the LLM learns to perform the task without weight updates. Do models guided via ICL infer the…

Computation and Language · Computer Science 2024-04-11 Aaron Mueller, Albert Webson, Jackson Petty, Tal Linzen

Which Attention Heads Matter for In-Context Learning?

Large language models (LLMs) exhibit impressive in-context learning (ICL) capability, enabling them to perform new tasks using only a few demonstrations in the prompt. Two different mechanisms have been proposed to explain ICL: induction…

Machine Learning · Computer Science 2025-05-05 Kayo Yin, Jacob Steinhardt