Related papers: Efficient Model Development through Fine-tuning Tr…

Donors and Recipients: On Asymmetric Transfer Across Tasks and Languages with Parameter-Efficient Fine-Tuning

Large language models (LLMs) perform strongly across tasks and languages, yet how improvements in one task or language affect other tasks and languages remains poorly understood. We conduct a controlled LoRA fine-tuning study across…

Computation and Language · Computer Science 2026-01-09 Kajetan Dymkiewicz, Ivan Vulic, Helen Yannakoudakis, Eilam Shapira, Roi Reichart, Anna Korhonen

Large Language Models to Diffusion Finetuning

We propose a new finetuning method to provide pre-trained large language models (LMs) the ability to scale test-time compute through the diffusion framework. By increasing the number of diffusion steps, we show our finetuned models achieve…

Computation and Language · Computer Science 2025-06-04 Edoardo Cetin, Tianyu Zhao, Yujin Tang

The Unreasonable Effectiveness of Model Merging for Cross-Lingual Transfer in LLMs

Large language models (LLMs) still struggle across tasks outside of high-resource languages. In this work, we investigate cross-lingual transfer to lower-resource languages where task-specific post-training data is scarce. Building on prior…

Computation and Language · Computer Science 2025-10-09 Lucas Bandarkar, Nanyun Peng

A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models

Generative Large Language Models (LLMs) have achieved remarkable advancements in various NLP tasks. However, these advances have not been reflected in the translation task, especially those with moderate model sizes (i.e., 7B or 13B…

Computation and Language · Computer Science 2024-02-07 Haoran Xu, Young Jin Kim, Amr Sharaf, Hany Hassan Awadalla

MUSCLE: A Model Update Strategy for Compatible LLM Evolution

Large Language Models (LLMs) are regularly updated to enhance performance, typically through changes in data or architecture. Within the update process, developers often prioritize improving overall performance metrics, paying less…

Artificial Intelligence · Computer Science 2024-10-07 Jessica Echterhoff, Fartash Faghri, Raviteja Vemulapalli, Ting-Yao Hu, Chun-Liang Li, Oncel Tuzel, Hadi Pouransari

Update Your Transformer to the Latest Release: Re-Basin of Task Vectors

Foundation models serve as the backbone for numerous specialized models developed through fine-tuning. However, when the underlying pretrained model is updated or retrained (e.g., on larger and more curated datasets), the fine-tuned model…

Machine Learning · Computer Science 2025-05-30 Filippo Rinaldi, Giacomo Capitani, Lorenzo Bonicelli, Donato Crisostomi, Federico Bolelli, Elisa Ficarra, Emanuele Rodolà, Simone Calderara, Angelo Porrello

Transfer Learning for Finetuning Large Language Models

As the landscape of large language models expands, efficiently finetuning for specific tasks becomes increasingly crucial. At the same time, the landscape of parameter-efficient finetuning methods rapidly expands. Consequently,…

Computation and Language · Computer Science 2024-11-05 Tobias Strangmann, Lennart Purucker, Jörg K. H. Franke, Ivo Rapant, Fabio Ferreira, Frank Hutter

Adapt Once, Thrive with Updates: Transferable Parameter-Efficient Fine-Tuning on Evolving Base Models

Parameter-efficient fine-tuning (PEFT) has become a common method for fine-tuning large language models, where a base model can serve multiple users through PEFT module switching. To enhance user experience, base models require periodic…

Computation and Language · Computer Science 2025-06-10 Naibin Gu, Peng Fu, Xiyu Liu, Ke Ma, Zheng Lin, Weiping Wang

Fine-Tuning and Deploying Large Language Models Over Edges: Issues and Approaches

Since the release of GPT2-1.5B in 2019, the large language models (LLMs) have evolved from specialized deep models to versatile foundation models. While demonstrating remarkable zero-shot ability, the LLMs still require fine-tuning on local…

Artificial Intelligence · Computer Science 2025-08-07 Yanjie Dong, Haijun Zhang, Chengming Li, Song Guo, Victor C. M. Leung, Xiping Hu

How Much Data is Enough Data? Fine-Tuning Large Language Models for In-House Translation: Performance Evaluation Across Multiple Dataset Sizes

Decoder-only LLMs have shown impressive performance in MT due to their ability to learn from extensive datasets and generate high-quality translations. However, LLMs often struggle with the nuances and style required for…

Computation and Language · Computer Science 2024-09-11 Inacio Vieira, Will Allred, Séamus Lankford, Sheila Castilho, Andy Way

Can LLMs' Tuning Methods Work in Medical Multimodal Domain?

While Large Language Models (LLMs) excel in world knowledge understanding, adapting them to specific subfields requires precise adjustments. Due to the model's vast scale, traditional global fine-tuning methods for large models can be…

Computer Vision and Pattern Recognition · Computer Science 2024-07-09 Jiawei Chen, Yue Jiang, Dingkang Yang, Mingcheng Li, Jinjie Wei, Ziyun Qian, Lihua Zhang

The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities

Fine-tuning large language models (LLMs) for machine translation has shown improvements in overall translation quality. However, it is unclear what is the impact of fine-tuning on desirable LLM behaviors that are not present in neural…

Computation and Language · Computer Science 2024-08-07 David Stap, Eva Hasler, Bill Byrne, Christof Monz, Ke Tran

Simple and Effective Input Reformulations for Translation

Foundation language models learn from their finetuning input context in different ways. In this paper, we reformulate inputs during finetuning for challenging translation tasks, leveraging model strengths from pretraining in novel ways to…

Computation and Language · Computer Science 2026-01-05 Brian Yu, Hansen Lillemark, Kurt Keutzer

Fine-tuning Large Language Models for Entity Matching

Generative large language models (LLMs) are a promising alternative to pre-trained language models for entity matching due to their high zero-shot performance and ability to generalize to unseen entities. Existing research on using LLMs for…

Computation and Language · Computer Science 2025-05-22 Aaron Steiner, Ralph Peeters, Christian Bizer

A Novel Paradigm Boosting Translation Capabilities of Large Language Models

This paper presents a study on strategies to enhance the translation capabilities of large language models (LLMs) in the context of machine translation (MT) tasks. The paper proposes a novel paradigm consisting of three stages: Secondary…

Computation and Language · Computer Science 2024-04-16 Jiaxin Guo, Hao Yang, Zongyao Li, Daimeng Wei, Hengchao Shang, Xiaoyu Chen

$\alpha$-LoRA: Effective Fine-Tuning via Base Model Rescaling

Fine-tuning has proven to be highly effective in adapting pre-trained models to perform better on new desired tasks with minimal data samples. Among the most widely used approaches are reparameterization methods, which update a target…

Machine Learning · Computer Science 2025-10-27 Aymane El Firdoussi, El Mahdi Chayti, Mohamed El Amine Seddik, Martin Jaggi

Analyzing and Reducing the Performance Gap in Cross-Lingual Transfer with Fine-tuning Slow and Fast

Existing research has shown that a multilingual pre-trained language model fine-tuned with one (source) language also performs well on downstream tasks for non-source languages, even though no fine-tuning is done on these languages.…

Computation and Language · Computer Science 2023-05-22 Yiduo Guo, Yaobo Liang, Dongyan Zhao, Bing Liu, Duan Nan

Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs

While large language models demonstrate remarkable capabilities at task-specific applications through fine-tuning, extending these benefits across diverse languages is essential for broad accessibility. However, effective cross-lingual…

Computation and Language · Computer Science 2025-06-03 Danni Liu, Jan Niehues

Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs

Large Language Models (LLMs) have demonstrated significant potential in transforming clinical applications. In this study, we investigate the efficacy of four techniques in adapting LLMs for clinical use-cases: continuous pretraining,…

Computation and Language · Computer Science 2024-09-24 Clément Christophe, Tathagata Raha, Svetlana Maslenkova, Muhammad Umar Salman, Praveen K Kanithi, Marco AF Pimentel, Shadab Khan

Distributed Inference and Fine-tuning of Large Language Models Over The Internet

Large language models (LLMs) are useful in many NLP tasks and become more capable with size, with the best open-source models having over 50 billion parameters. However, using these 50B+ models requires high-end hardware, making them…

Machine Learning · Computer Science 2023-12-14 Alexander Borzunov, Max Ryabinin, Artem Chumachenko, Dmitry Baranchuk, Tim Dettmers, Younes Belkada, Pavel Samygin, Colin Raffel