Related papers: RanDeS: Randomized Delta Superposition for Multi-M…

Towards Reversible Model Merging For Low-rank Weights

Model merging aims to combine multiple fine-tuned models into a single set of weights that performs well across all source tasks. While prior work has shown that merging can approximate the performance of individual fine-tuned models for…

Machine Learning · Computer Science 2025-10-17 Mohammadsajad Alipour, Mohammad Mohammadi Amiri

Localizing Task Information for Improved Model Merging and Compression

Model merging and task arithmetic have emerged as promising scalable approaches to merge multiple single-task checkpoints to one multi-task model, but their applicability is reduced by significant performance loss. Previous works have…

Machine Learning · Computer Science 2024-05-14 Ke Wang, Nikolaos Dimitriadis, Guillermo Ortiz-Jimenez, François Fleuret, Pascal Frossard

Mixup Model Merge: Enhancing Model Merging Performance through Randomized Linear Interpolation

Model merging aims to integrate multiple task-specific models into a unified model that inherits the capabilities of the task-specific models, without additional training. Existing model merging methods often lack consideration of the…

Computation and Language · Computer Science 2025-08-07 Yue Zhou, Yi Chang, Yuan Wu

Multi-Task Model Merging via Adaptive Weight Disentanglement

Model merging has recently gained attention as an economical and scalable approach to incorporate task-specific weights from various tasks into a unified multi-task model. For example, in Task Arithmetic (TA), adding the fine-tuned weights…

Machine Learning · Computer Science 2025-01-10 Feng Xiong, Runxi Cheng, Wang Chen, Zhanqiu Zhang, Yiwen Guo, Chun Yuan, Ruifeng Xu

Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

Merging multiple expert models offers a promising approach for performing multi-task learning without accessing their original data. Existing methods attempt to alleviate task conflicts by sparsifying task vectors or promoting orthogonality…

Machine Learning · Computer Science 2025-05-27 Yongxian Wei, Anke Tang, Li Shen, Zixuan Hu, Chun Yuan, Xiaochun Cao

Resolving Interference (RI): Disentangling Models for Improved Model Merging

Model merging has shown that multitask models can be created by directly combining the parameters of different models that are each specialized on tasks of interest. However, models trained independently on distinct tasks often exhibit…

Machine Learning · Computer Science 2026-03-17 Pratik Ramesh, George Stoica, Arun Iyer, Leshem Choshen, Judy Hoffman

Merging by Matching Models in Task Parameter Subspaces

Model merging aims to cheaply combine individual task-specific models into a single multitask model. In this work, we view past merging methods as leveraging different notions of a "task parameter subspace" in which models are matched…

Machine Learning · Computer Science 2024-04-16 Derek Tam, Mohit Bansal, Colin Raffel

Modular Delta Merging with Orthogonal Constraints: A Scalable Framework for Continual and Reversible Model Composition

In real-world machine learning deployments, models must be continually updated, composed, and when required, selectively undone. However, existing approaches to model merging and continual learning often suffer from task interference,…

Machine Learning · Computer Science 2026-04-14 Haris Khan, Sadia Asif, Shumaila Asif, Muhammad Zeeshan Karamat, Rajesh Upadhayaya

Superpose Task-specific Features for Model Merging

Model merging enables powerful capabilities in neural networks without requiring additional training. In this paper, we introduce a novel perspective on model merging by leveraging the fundamental mechanisms of neural network…

Machine Learning · Computer Science 2025-09-19 Haiquan Qiu, You Wu, Dong Li, Jianmin Guo, Quanming Yao

Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking

In the era of large-scale training, model merging has evolved into a tool for creating multitasking models efficiently. It enables the knowledge of models to be fused, without the need for heavy computation as required in traditional…

Machine Learning · Computer Science 2025-10-30 Yuatyong Chaichana, Thanapat Trachu, Peerat Limkonchotiwat, Konpat Preechakul, Tirasan Khandhawit, Ekapol Chuangsuwanich

Model Merging by Output-Space Projection

Model merging combines fine-tuned checkpoints into a single multi-task model without retraining. Existing methods - such as task arithmetic, model soups, TIES, and DARE - are computationally efficient and empirically successful, but rely on…

Machine Learning · Computer Science 2026-05-29 Bethan Evans, Benjamin Etheridge, Stephen Roberts, Jared Tanner

Parameter-Efficient Interventions for Enhanced Model Merging

Model merging combines knowledge from task-specific models into a unified multi-task model to avoid joint training on all task data. However, current methods face challenges due to representation bias, which can interfere with tasks…

Computer Vision and Pattern Recognition · Computer Science 2024-12-24 Marcin Osial, Daniel Marczak, Bartosz Zieliński

To See a World in a Spark of Neuron: Disentangling Multi-task Interference for Training-free Model Merging

Fine-tuning pre-trained models on targeted datasets enhances task-specific performance but often comes at the expense of generalization. Model merging techniques, which integrate multiple fine-tuned models into a single multi-task model…

Machine Learning · Computer Science 2025-09-11 Zitao Fang, Guodong DU, Shuyang Yu, Yifei Guo, Yiwei Zhang, Yiyao Cao, Jing Li, Ho-Kin Tang, Sim Kuan Goh

Revisiting Weight Averaging for Model Merging

Model merging aims to build a multi-task learner by combining the parameters of individually fine-tuned models without additional training. While a straightforward approach is to average model parameters across tasks, this often results in…

Machine Learning · Computer Science 2025-04-04 Jiho Choi, Donggyun Kim, Chanhyuk Lee, Seunghoon Hong

An Empirical Study and Theoretical Explanation on Task-Level Model-Merging Collapse

Model merging unifies independently fine-tuned LLMs from the same base, enabling reuse and integration of parallel development efforts without retraining. However, in practice we observe that merging does not always succeed: certain…

Artificial Intelligence · Computer Science 2026-03-11 Yuan Cao, Dezhi Ran, Yuzhe Guo, Mengzhou Wu, Simin Chen, Linyi Li, Wei Yang, Tao Xie

SE-Merging: A Self-Enhanced Approach for Dynamic Model Merging

Model merging has gained increasing attention due to its intriguing property: interpolating the parameters of different task-specific fine-tuned models leads to multi-task abilities. However, despite its empirical success, the underlying…

Artificial Intelligence · Computer Science 2025-06-24 Zijun Chen, Zhanpeng Zhou, Bo Zhang, Weinan Zhang, Xi Sun, Junchi Yan

No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces

Model merging integrates the weights of multiple task-specific models into a single multi-task model. Despite recent interest in the problem, a significant performance gap between the combined and single-task models remains. In this paper,…

Machine Learning · Computer Science 2025-06-12 Daniel Marczak, Simone Magistri, Sebastian Cygert, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

Model Merging in the Essential Subspace

Model merging aims to integrate multiple task-specific fine-tuned models derived from a shared pre-trained checkpoint into a single multi-task model without additional training. Despite extensive research, task interference remains a major…

Machine Learning · Computer Science 2026-02-25 Longhua Li, Lei Qi, Qi Tian, Xin Geng

Unraveling LoRA Interference: Orthogonal Subspaces for Robust Model Merging

Fine-tuning large language models (LMs) for individual tasks yields strong performance but is expensive for deployment and storage. Recent works explore model merging to combine multiple task-specific models into a single multi-task model…

Computation and Language · Computer Science 2025-05-30 Haobo Zhang, Jiayu Zhou

The Non-Local Model Merging Problem: Permutation Symmetries and Variance Collapse

Model merging aims to efficiently combine the weights of multiple expert models, each trained on a specific task, into a single multi-task model, with strong performance across all tasks. When applied to all but the last layer of weights,…

Machine Learning · Computer Science 2024-10-17 Ekansh Sharma, Daniel M. Roy, Gintare Karolina Dziugaite