Related papers: In-Context Learning Creates Task Vectors

Re-examining learning linear functions in context

In-context learning (ICL) has emerged as a powerful paradigm for easily adapting Large Language Models (LLMs) to various tasks. However, our understanding of how ICL works remains limited. We explore a simple model of ICL in a controlled…

Machine Learning · Computer Science · 2025-09-03 · Omar Naim, Guilhem Fouilhé, Nicholas Asher

Learning Task Representations from In-Context Learning

Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning (ICL), where models adapt to new tasks through example-based prompts without requiring parameter updates. However, understanding how tasks are…

Computation and Language · Computer Science · 2025-11-11 · Baturay Saglam, Xinyang Hu, Zhuoran Yang, Dionysis Kalogerias, Amin Karbasi

What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning

Large language models (LLMs) exploit in-context learning (ICL) to solve tasks with only a few demonstrations, but its mechanisms are not yet well-understood. Some works suggest that LLMs only recall already learned concepts from…

Computation and Language · Computer Science · 2023-05-18 · Jane Pan, Tianyu Gao, Howard Chen, Danqi Chen

What Do Language Models Learn in Context? The Structured Task Hypothesis

Large language models (LLMs) exhibit an intriguing ability to learn a novel task from in-context examples presented in a demonstration, termed in-context learning (ICL). Understandably, a swath of research has been dedicated to uncovering…

Computation and Language · Computer Science · 2024-08-06 · Jiaoda Li, Yifan Hou, Mrinmaya Sachan, Ryan Cotterell

A Survey on In-context Learning

With the increasing capabilities of large language models (LLMs), in-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP), where LLMs make predictions based on contexts augmented with a few examples. It…

Computation and Language · Computer Science · 2024-10-08 · Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Jingyuan Ma, Rui Li, Heming Xia, Jingjing Xu, Zhiyong Wu, Tianyu Liu, Baobao Chang, Xu Sun, Lei Li, Zhifang Sui

In-Context Learning through the Bayesian Prism

In-context learning (ICL) is one of the surprising and useful features of large language models and subject of intense research. Recently, stylized meta-learning-like ICL setups have been devised that train transformers on sequences of…

Machine Learning · Computer Science · 2024-04-16 · Madhur Panwar, Kabir Ahuja, Navin Goyal

A Survey to Recent Progress Towards Understanding In-Context Learning

In-Context Learning (ICL) empowers Large Language Models (LLMs) with the ability to learn from a few examples provided in the prompt, enabling downstream generalization without the requirement for gradient updates. Despite encouragingly…

Computation and Language · Computer Science · 2025-01-28 · Haitao Mao, Guangliang Liu, Yao Ma, Rongrong Wang, Kristen Johnson, Jiliang Tang

Context-Scaling versus Task-Scaling in In-Context Learning

Transformers exhibit In-Context Learning (ICL), where these models solve new tasks by using examples in the prompt without additional training. In our work, we identify and analyze two key components of ICL: (1) context-scaling, where model…

Machine Learning · Computer Science · 2024-10-17 · Amirhesam Abedsoltan, Adityanarayanan Radhakrishnan, Jingfeng Wu, Mikhail Belkin

How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes

Large language models (LLM) have recently shown the extraordinary ability to perform unseen tasks based on few-shot examples provided as text, also known as in-context learning (ICL). While recent works have attempted to understand the…

Computation and Language · Computer Science · 2024-04-05 · Harmon Bhasin, Timothy Ossowski, Yiqiao Zhong, Junjie Hu

How Do Transformers Learn In-Context Beyond Simple Functions? A Case Study on Learning with Representations

While large language models based on the transformer architecture have demonstrated remarkable in-context learning (ICL) capabilities, understandings of such capabilities are still in an early stage, where existing theory and mechanistic…

Machine Learning · Computer Science · 2023-10-17 · Tianyu Guo, Wei Hu, Song Mei, Huan Wang, Caiming Xiong, Silvio Savarese, Yu Bai

Label Words as Local Task Vectors in In-Context Learning

Large Language Models (LLMs) have demonstrated remarkable abilities, one of the most important being in-context learning (ICL). With ICL, LLMs can derive the underlying rule from a few demonstrations and provide answers that comply with the…

Computation and Language · Computer Science · 2025-12-23 · Bowen Zheng, Ming Ma, Zhongqiao Lin, Tianming Yang

In-Context Learning with Representations: Contextual Generalization of Trained Transformers

In-context learning (ICL) refers to a remarkable capability of pretrained large language models, which can learn a new task given a few examples during inference. However, theoretical understanding of ICL is largely under-explored,…

Machine Learning · Computer Science · 2024-09-27 · Tong Yang, Yu Huang, Yingbin Liang, Yuejie Chi

Does In-Context Learning Really Learn? Rethinking How Large Language Models Respond and Solve Tasks via In-Context Learning

In-context Learning (ICL) has emerged as a powerful capability alongside the development of scaled-up large language models (LLMs). By instructing LLMs using few-shot demonstrative examples, ICL enables them to perform a wide range of tasks…

Computation and Language · Computer Science · 2024-07-24 · Quanyu Long, Yin Wu, Wenya Wang, Sinno Jialin Pan

Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study

Large language models (LLMs) like GPT-4 and LLaMA-3 utilize the powerful in-context learning (ICL) capability of Transformer architecture to learn on the fly from limited examples. While ICL underpins many LLM applications, its full…

Machine Learning · Computer Science · 2025-03-21 · Xingxuan Zhang, Haoran Wang, Jiansheng Li, Yuan Xue, Shikai Guan, Renzhe Xu, Hao Zou, Han Yu, Peng Cui

Learning Linear Regression with Low-Rank Tasks in-Context

In-context learning (ICL) is a key building block of modern large language models, yet its theoretical mechanisms remain poorly understood. It is particularly mysterious how ICL operates in real-world applications where tasks have a common…

Disordered Systems and Neural Networks · Physics · 2026-04-24 · Kaito Takanami, Takashi Takahashi, Yoshiyuki Kabashima

In-Context Language Learning: Architectures and Algorithms

Large-scale neural language models exhibit a remarkable capacity for in-context learning (ICL): they can infer novel functions from datasets provided as input. Most of our current understanding of when and how ICL arises comes from LMs…

Computation and Language · Computer Science · 2024-01-31 · Ekin Akyürek, Bailin Wang, Yoon Kim, Jacob Andreas

Unlocking In-Context Learning for Natural Datasets Beyond Language Modelling

Large Language Models (LLMs) exhibit In-Context Learning (ICL), which enables the model to perform new tasks conditioning only on the examples provided in the context without updating the model's weights. While ICL offers fast adaptation…

Computation and Language · Computer Science · 2025-10-07 · Jelena Bratulić, Sudhanshu Mittal, David T. Hoffmann, Samuel Böhm, Robin Tibor Schirrmeister, Tonio Ball, Christian Rupprecht, Thomas Brox

Understanding Task Vectors in In-Context Learning: Emergence, Functionality, and Limitations

Task vectors offer a compelling mechanism for accelerating inference in in-context learning (ICL) by distilling task-specific information into a single, reusable representation. Despite their empirical success, the underlying principles…

Machine Learning · Computer Science · 2025-06-11 · Yuxin Dong, Jiachen Jiang, Zhihui Zhu, Xia Ning

Transformers as Algorithms: Generalization and Stability in In-context Learning

In-context learning (ICL) is a type of prompting where a transformer model operates on a sequence of (input, output) examples and performs inference on-the-fly. In this work, we formalize in-context learning as an algorithm learning problem…

Machine Learning · Computer Science · 2023-02-07 · Yingcong Li, M. Emrullah Ildiz, Dimitris Papailiopoulos, Samet Oymak

Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models

Transformer models, notably large language models (LLMs), have the remarkable ability to perform in-context learning (ICL) -- to perform new tasks when prompted with unseen input-output examples without any explicit model training. In this…

Machine Learning · Computer Science · 2023-11-03 · Steve Yadlowsky, Lyric Doshi, Nilesh Tripuraneni