Related papers: Efficient LLM Context Distillation

200 papers

Given the success with in-context learning of large pre-trained language models, we introduce in-context learning distillation to transfer in-context few-shot learning ability from large models to smaller models. We propose to combine…

Computation and Language · Computer Science 2022-12-22 Yukun Huang, Yanda Chen, Zhou Yu, Kathleen McKeown

We applied few-shot in-context learning on the OPT-1.3B model for the natural language inference task and employed knowledge distillation to internalize the context information, reducing model parameter from 1.3B to 125M and achieving a…

Computation and Language · Computer Science 2024-12-19 Yifei Duan, Liu Li, Zirui Zhai, Jinxia Yao

As the field continues its push for ever more resources, this work turns the spotlight on a critical question: how can vision-language models (VLMs) be adapted to thrive in low-resource, budget-constrained settings? While large VLMs offer…

Computer Vision and Pattern Recognition · Computer Science 2026-04-08 Zhiqi Kang, Rahaf Aljundi, Vaggelis Dorovatas, Karteek Alahari

Real-world applications of large language models (LLMs) in computational social science (CSS) tasks primarily depend on the effectiveness of instruction tuning (IT) or in-context learning (ICL). While IT has shown highly effective at…

Computation and Language · Computer Science 2024-09-24 Taihang Wang, Xiaoman Xu, Yimin Wang, Ye Jiang

Large Multimodal Models (LMMs) often rely on in-context learning (ICL) to perform new visual question answering (VQA) tasks with minimal supervision. However, ICL performance, especially in smaller LMMs, does not always improve…

Artificial Intelligence · Computer Science 2026-03-03 Akash Gupta, Amos Storkey, Mirella Lapata

Recent advances in large language models (LLMs) enable effective in-context learning (ICL) with many-shot examples, but at the cost of high computational demand due to longer input tokens. To address this, we propose cheat-sheet ICL, which…

Computation and Language · Computer Science 2025-09-26 Ukyo Honda, Soichiro Murakami, Peinan Zhang

In the domain of large language models (LLMs), arXiv:2305.16938 showed that few-shot full-model fine-tuning -- namely Vanilla Fine Tuning (FT) and Pattern-Based Fine Tuning (PBFT) --, and In-Context Learning (ICL) generalize similarly on…

Computation and Language · Computer Science 2024-05-24 Krishna Prasad Varadarajan Srinivasan, Prasanth Gumpena, Madhusudhana Yattapu, Vishal H. Brahmbhatt

In-context learning (ICL) allows large language models (LLMs) to solve novel tasks without weight updates. Despite its empirical success, the mechanism behind ICL remains poorly understood, limiting our ability to interpret, improve, and…

Machine Learning · Computer Science 2025-06-16 Chengye Li, Haiyun Liu, Yuanxi Li

Recent interest has surged in employing Large Language Models (LLMs) for machine translation (MT) via in-context learning (ICL) (Vilar et al., 2023). Most prior studies primarily focus on optimizing translation quality, with limited…

Computation and Language · Computer Science 2024-06-06 Pranjal A. Chitale, Jay Gala, Raj Dabre

Fine-tuning Large Language Models (LLMs) typically involves updating at least a few billions of parameters. A more parameter-efficient approach is Prompt Tuning (PT), which updates only a few learnable tokens, and differently, In-Context…

Computation and Language · Computer Science 2024-10-23 Tsachi Blau, Moshe Kimhi, Yonatan Belinkov, Alexander Bronstein, Chaim Baskin

In-context learning (ICL) has proven to be a significant capability with the advancement of Large Language models (LLMs). By instructing LLMs using few-shot demonstrative examples, ICL enables them to perform a wide range of tasks without…

Computation and Language · Computer Science 2024-08-21 Quanyu Long, Jianda Chen, Wenya Wang, Sinno Jialin Pan

When adapting large language models (LLMs) to a specific downstream task, two primary approaches are commonly employed: (1) prompt engineering, often with in-context few-shot learning, leveraging the model's inherent generalization…

Machine Learning · Computer Science 2025-12-24 Jorg Bornschein, Clare Lyle, Yazhe Li, Amal Rannen-Triki, Xu Owen He, Razvan Pascanu

The ability of generative large language models (LLMs) to perform in-context learning has given rise to a large body of research into how best to prompt models for various natural language processing tasks. In this paper, we focus on…

Computation and Language · Computer Science 2024-08-02 Armel Zebaze, Benoît Sagot, Rachel Bawden

Post-training endows pretrained LLMs with a variety of desirable skills, including instruction-following, reasoning, and others. However, these post-trained LLMs only encode knowledge up to a cut-off date, necessitating continual…

Computation and Language · Computer Science 2026-02-19 Shankar Padmanabhan, Mustafa Omer Gul, Tanya Goyal

Large language models (LLMs) excel in complex reasoning tasks, and distilling their reasoning capabilities into smaller models has shown promise. However, we uncover an interesting phenomenon, which we term the Small Model Learnability Gap:…

Artificial Intelligence · Computer Science 2025-11-14 Yuetai Li, Xiang Yue, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Bill Yuchen Lin, Bhaskar Ramasubramanian, Radha Poovendran

In-Context Learning (ICL) is a technique by which language models make predictions based on examples provided in their input context. Previously, their context window size imposed a limit on the number of examples that can be shown, making…

Computation and Language · Computer Science 2025-05-29 Jinheon Baek, Sun Jae Lee, Prakhar Gupta, Geunseob Oh, Siddharth Dalmia, Prateek Kolhar

Long context understanding remains challenging for large language models due to their limited context windows. This paper introduces Long Input Fine-Tuning (LIFT) for long context modeling, a novel framework that enhances LLM performance on…

Computation and Language · Computer Science 2024-12-19 Yansheng Mao, Jiaqi Li, Fanxu Meng, Jing Xiong, Zilong Zheng, Muhan Zhang

With the increasing ability of large language models (LLMs), in-context learning (ICL) has evolved as a new paradigm for natural language processing (NLP), where instead of fine-tuning the parameters of an LLM specific to a downstream task…

Information Retrieval · Computer Science 2024-05-03 Andrew Parry, Debasis Ganguly, Manish Chandra

In-context learning (ICL) enables Large Language Models (LLMs) to adapt to new tasks with only a small set of examples at inference time, thereby avoiding task-specific fine-tuning. However, in-context examples may contain privacy-sensitive…

Machine Learning · Computer Science 2026-02-06 Rob Romijnders, Mohammad Mahdi Derakhshani, Jonathan Petit, Max Welling, Christos Louizos, Yuki M. Asano

Context distillation enables language models to internalize in-context knowledge into their parameters. In our work, we propose On-Policy Context Distillation (OPCD), a framework that bridges on-policy distillation with context distillation…

Computation and Language · Computer Science 2026-03-24 Tianzhu Ye, Li Dong, Xun Wu, Shaohan Huang, Furu Wei