Related papers: In-Context Semi-Supervised Learning

When and How Unlabeled Data Provably Improve In-Context Learning

Recent research shows that in-context learning (ICL) can be effective even when demonstrations have missing or incorrect labels. To shed light on this capability, we examine a canonical setting where the demonstrations are drawn according…

Machine Learning · Computer Science 2026-01-27 Yingcong Li, Xiangyu Chang, Muti Kara, Xiaofeng Liu, Amit Roy-Chowdhury, Samet Oymak

In-Context Learning with Representations: Contextual Generalization of Trained Transformers

In-context learning (ICL) refers to a remarkable capability of pretrained large language models, which can learn a new task given a few examples during inference. However, theoretical understanding of ICL is largely under-explored,…

Machine Learning · Computer Science 2024-09-27 Tong Yang, Yu Huang, Yingbin Liang, Yuejie Chi

Unlabeled Data Can Provably Enhance In-Context Learning of Transformers

Large language models (LLMs) exhibit impressive in-context learning (ICL) capabilities, yet the quality of their predictions is fundamentally limited by the few costly labeled demonstrations that can fit into a prompt. Meanwhile, there…

Machine Learning · Computer Science 2026-01-16 Renpu Liu, Jing Yang

Semi-Supervised Learning in the Few-Shot Zero-Shot Scenario

Semi-Supervised Learning (SSL) is a framework that utilizes both labeled and unlabeled data to enhance model performance. Conventional SSL methods operate under the assumption that labeled and unlabeled data share the same label space.…

Computer Vision and Pattern Recognition · Computer Science 2023-11-16 Noam Fluss, Guy Hacohen, Daphna Weinshall

Exploring the Robustness of In-Context Learning with Noisy Labels

Recently, the mysterious In-Context Learning (ICL) ability exhibited by Transformer architectures, especially in large language models (LLMs), has sparked significant research interest. However, the resilience of Transformers' in-context…

Computation and Language · Computer Science 2024-05-02 Chen Cheng, Xinzhi Yu, Haodong Wen, Jingsong Sun, Guanzhang Yue, Yihao Zhang, Zeming Wei

Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning

Large language models (LLMs) have shown remarkable capacity for in-context learning (ICL), where learning a new task from just a few training examples is done without being explicitly pre-trained. However, despite the success of LLMs, there…

Computation and Language · Computer Science 2023-08-02 Xindi Wang, Yufei Wang, Can Xu, Xiubo Geng, Bowen Zhang, Chongyang Tao, Frank Rudzicz, Robert E. Mercer, Daxin Jiang

Does In-Context Learning Really Learn? Rethinking How Large Language Models Respond and Solve Tasks via In-Context Learning

In-context Learning (ICL) has emerged as a powerful capability alongside the development of scaled-up large language models (LLMs). By instructing LLMs using few-shot demonstrative examples, ICL enables them to perform a wide range of tasks…

Computation and Language · Computer Science 2024-07-24 Quanyu Long, Yin Wu, Wenya Wang, Sinno Jialin Pan

On the Relationship Between the Choice of Representation and In-Context Learning

In-context learning (ICL) is the ability of a large language model (LLM) to learn a new task from a few demonstrations presented as part of the context. Past studies have attributed a large portion of the success of ICL to the way these…

Computation and Language · Computer Science 2025-10-10 Ioana Marinescu, Kyunghyun Cho, Eric Karl Oermann

Disentangling Latent Shifts of In-Context Learning with Weak Supervision

In-context learning (ICL) enables large language models to perform few-shot learning by conditioning on labeled examples in the prompt. Despite its flexibility, ICL suffers from instability, especially as prompt length increases with more…

Computation and Language · Computer Science 2025-10-27 Josip Jukić, Jan Šnajder

How Do Transformers Learn In-Context Beyond Simple Functions? A Case Study on Learning with Representations

While large language models based on the transformer architecture have demonstrated remarkable in-context learning (ICL) capabilities, understandings of such capabilities are still in an early stage, where existing theory and mechanistic…

Machine Learning · Computer Science 2023-10-17 Tianyu Guo, Wei Hu, Song Mei, Huan Wang, Caiming Xiong, Silvio Savarese, Yu Bai

Can Transformers Learn Sequential Function Classes In Context?

In-context learning (ICL) has revolutionized the capabilities of transformer models in NLP. In our project, we extend the understanding of the mechanisms underpinning ICL by exploring whether transformers can learn from sequential,…

Machine Learning · Computer Science 2023-12-22 Ryan Campbell, Emma Guo, Evan Hu, Reya Vir, Ethan Hsiao

In-Context Learning with Long-Context Models: An In-Depth Exploration

As model context lengths continue to increase, the number of demonstrations that can be provided in-context approaches the size of entire training datasets. We study the behavior of in-context learning (ICL) at this extreme scale on…

Computation and Language · Computer Science 2025-03-05 Amanda Bertsch, Maor Ivgi, Emily Xiao, Uri Alon, Jonathan Berant, Matthew R. Gormley, Graham Neubig

Transformers are Minimax Optimal Nonparametric In-Context Learners

In-context learning (ICL) of large language models has proven to be a surprisingly effective method of learning a new task from only a few demonstrative examples. In this paper, we study the efficacy of ICL from the viewpoint of statistical…

Machine Learning · Statistics 2024-10-03 Juno Kim, Tai Nakamaki, Taiji Suzuki

Semi-supervised Learning with Contrastive Predicative Coding

Semi-supervised learning (SSL) provides a powerful framework for leveraging unlabeled data when labels are limited or expensive to obtain. SSL algorithms based on deep neural networks have recently proven successful on standard benchmark…

Machine Learning · Computer Science 2019-05-28 Jiaxing Wang, Yin Zheng, Xiaoshuang Chen, Junzhou Huang, Jian Cheng

Transformers Don't In-Context Learn Least Squares Regression

In-context learning (ICL) has emerged as a powerful capability of large pretrained transformers, enabling them to solve new tasks implicit in example input-output pairs without any gradient updates. Despite its practical success, the…

Machine Learning · Computer Science 2025-07-15 Joshua Hill, Benjamin Eyre, Elliot Creager

In-Context Learning Learns Label Relationships but Is Not Conventional Learning

The predictions of Large Language Models (LLMs) on downstream tasks often improve significantly when including examples of the input–label relationship in the context. However, there is currently no consensus about how this in-context…

Computation and Language · Computer Science 2024-03-14 Jannik Kossen, Yarin Gal, Tom Rainforth

Can Transformers Break Encryption Schemes via In-Context Learning?

In-context learning (ICL) has emerged as a powerful capability of transformer-based language models, enabling them to perform tasks by conditioning on a small number of examples presented at inference time, without any parameter updates.…

Machine Learning · Computer Science 2025-08-15 Jathin Korrapati, Patrick Mendoza, Aditya Tomar, Abein Abraham

Investigation into In-Context Learning Capabilities of Transformers

Transformers have demonstrated a strong ability for in-context learning (ICL), enabling models to solve previously unseen tasks using only example input-output pairs provided at inference time. While prior theoretical work has established…

Machine Learning · Computer Science 2026-05-19 Rushil Chandrupatla, Leo Bangayan, Sebastian Leng

Theoretical Understanding of In-Context Learning in Shallow Transformers with Unstructured Data

Large language models (LLMs) are powerful models that can learn concepts at the inference stage via in-context learning (ICL). While theoretical studies, e.g., \cite{zhang2023trained}, attempt to explain the mechanism of ICL, they assume…

Machine Learning · Computer Science 2024-06-19 Yue Xing, Xiaofeng Lin, Chenheng Xu, Namjoon Suh, Qifan Song, Guang Cheng

A Mechanism for Sample-Efficient In-Context Learning for Sparse Retrieval Tasks

We study the phenomenon of in-context learning (ICL) exhibited by large language models, where they can adapt to a new learning task, given a handful of labeled examples, without any explicit parameter optimization. Our goal is to…

Machine Learning · Computer Science 2023-05-29 Jacob Abernethy, Alekh Agarwal, Teodor V. Marinov, Manfred K. Warmuth