
Related papers: Context Parametrization with Compositional Adapter…

200 papers

Large language models (LLMs) often exhibit performance disparities across languages, with naive multilingual fine-tuning frequently degrading performance due to negative cross-lingual interference. To address this, we introduce COMPASS…

Machine Learning · Computer Science 2026-04-23 Noah Flynn

Large language models (LMs) are typically adapted to improve performance on new contexts (e.g., text prompts that define new tasks or domains) through fine-tuning or prompting. However, there is an accuracy-compute tradeoff: fine-tuning…

Machine Learning · Computer Science 2024-11-12 Tong Chen, Hao Fang, Patrick Xia, Xiaodong Liu, Benjamin Van Durme, Luke Zettlemoyer, Jianfeng Gao, Hao Cheng

In-context learning (ICL) has proven to be a significant capability with the advancement of large language models (LLMs). By instructing LLMs using few-shot demonstrative examples, ICL enables them to perform a wide range of tasks without…

Computation and Language · Computer Science 2024-08-21 Quanyu Long, Jianda Chen, Wenya Wang, Sinno Jialin Pan

In-context learning (ICL) of large language models (LLMs) has attracted increasing attention in the community where LLMs make predictions only based on instructions augmented with a few examples. Existing example selection methods for ICL…

Computation and Language · Computer Science 2024-08-26 Haowei Du, Dongyan Zhao

Adapter parameters provide a mechanism to modify the behavior of machine learning models and have gained significant popularity in the context of large language models (LLMs) and generative AI. These parameters can be merged to support…

Computation and Language · Computer Science 2026-03-13 Ondrej Bohdal, Mete Ozay, Jijoong Moon, Kyeng-Hun Lee, Hyeonmok Ko, Umberto Michieli

Integrating Large Language Models (LLMs) into complex software systems enables the generation of human-understandable explanations of opaque AI processes, such as automated task planning. However, the quality and reliability of these…

Artificial Intelligence · Computer Science 2026-04-24 Gricel Vázquez, Alexandros Evangelidis, Sepeedeh Shahbeigi, Radu Calinescu, Simos Gerasimou

In-context learning (ICL) allows large language models (LLMs) to adapt to new tasks directly from the given demonstrations without requiring gradient updates. While recent advances have expanded context windows to accommodate more…

Despite the rising prevalence of neural language models, recent empirical evidence suggests their deficiency in compositional generalization. One of the current de-facto solutions to this problem is compositional data augmentation, which…

Computation and Language · Computer Science 2025-03-03 Zhaoyi Li, Gangwei Jiang, Chenwang Wu, Ying Wei, Defu Lian, Enhong Chen

Large language models (LLMs) are commonly adapted for diverse downstream tasks via parameter-efficient fine-tuning techniques such as Low-Rank Adapters (LoRA). While adapters can be combined to handle multiple tasks separately, standard…

The success of large language models (LLMs), like GPT-4 and ChatGPT, has led to the development of numerous cost-effective and accessible alternatives that are created by finetuning open-access LLMs with task-specific data (e.g.,…

Computation and Language · Computer Science 2023-10-10 Zhiqiang Hu, Lei Wang, Yihuai Lan, Wanyu Xu, Ee-Peng Lim, Lidong Bing, Xing Xu, Soujanya Poria, Roy Ka-Wei Lee

While model serving has unlocked unprecedented capabilities, the high cost of serving large-scale models continues to be a significant barrier to widespread accessibility and rapid innovation. Compiler optimizations have long driven…

Machine Learning · Computer Science 2026-02-05 Annabelle Sujun Tang, Christopher Priebe, Rohan Mahapatra, Lianhui Qin, Hadi Esmaeilzadeh

Large Language Models (LLMs) face significant computational challenges when processing long contexts due to the quadratic complexity of self-attention. While soft context compression methods, which map input text to smaller latent…

Computation and Language · Computer Science 2025-09-24 Gabriele Berton, Jayakrishnan Unnikrishnan, Son Tran, Mubarak Shah

Efficient long-context LLM deployment is stalled by a dichotomy between amortized compression, which struggles with out-of-distribution generalization, and Test-Time Training, which incurs prohibitive synthetic data costs and requires…

Machine Learning · Computer Science 2026-02-26 Zeju Li, Yizhou Zhou, Qiang Xu

Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts, but there has been limited understanding of exactly how these explanations function or why they are effective. This work aims to…

Computation and Language · Computer Science 2023-06-14 Xi Ye, Srinivasan Iyer, Asli Celikyilmaz, Ves Stoyanov, Greg Durrett, Ramakanth Pasunuru

This thesis investigates two key phenomena in large language models (LLMs): in-context learning (ICL) and model collapse. We study ICL in a linear transformer with tied weights trained on linear regression tasks, and show that minimising…

Artificial Intelligence · Computer Science 2026-01-06 Josef Ott

Despite the surprising few-shot performance of in-context learning (ICL), it is still a common practice to randomly sample examples to serve as context. This paper advocates a new principle for ICL: self-adaptive in-context learning. The…

Computation and Language · Computer Science 2023-05-04 Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong

Code snippet adaptation is a fundamental activity in the software development process. Unlike code generation, code snippet adaptation is not a "free creation": it requires developers to tailor a given code snippet in order to fit…

Software Engineering · Computer Science 2024-11-26 Tanghaoran Zhang, Yue Yu, Xinjun Mao, Shangwen Wang, Kang Yang, Yao Lu, Zhang Zhang, Yuxin Zhao

In-context learning (ICL) allows a language model to improve its problem-solving capability when provided with suitable information in context. Since the choice of in-context information can be determined based on the problem itself,…

Computation and Language · Computer Science 2025-09-12 Yinghui He, Abhishek Panigrahi, Yong Lin, Sanjeev Arora

Fine-tuning Large Language Models (LLMs) typically involves updating at least a few billion parameters. A more parameter-efficient approach is Prompt Tuning (PT), which updates only a few learnable tokens, and differently, In-Context…

Computation and Language · Computer Science 2024-10-23 Tsachi Blau, Moshe Kimhi, Yonatan Belinkov, Alexander Bronstein, Chaim Baskin

As strong general reasoners, large language models (LLMs) encounter diverse domains and tasks, where the ability to adapt and self-improve at test time is valuable. We introduce MASS, a meta-learning framework that enables LLMs to…

Machine Learning · Computer Science 2026-03-10 Zeyneb N. Kaya, Nick Rui