Related papers: Merging by Matching Models in Task Parameter Subsp…

Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

Merging multiple expert models offers a promising approach for performing multi-task learning without accessing their original data. Existing methods attempt to alleviate task conflicts by sparsifying task vectors or promoting orthogonality…

Machine Learning · Computer Science 2025-05-27 Yongxian Wei, Anke Tang, Li Shen, Zixuan Hu, Chun Yuan, Xiaochun Cao

Model Merging by Output-Space Projection

Model merging combines fine-tuned checkpoints into a single multi-task model without retraining. Existing methods - such as task arithmetic, model soups, TIES, and DARE - are computationally efficient and empirically successful, but rely on…

Machine Learning · Computer Science 2026-05-29 Bethan Evans, Benjamin Etheridge, Stephen Roberts, Jared Tanner

No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces

Model merging integrates the weights of multiple task-specific models into a single multi-task model. Despite recent interest in the problem, a significant performance gap between the combined and single-task models remains. In this paper,…

Machine Learning · Computer Science 2025-06-12 Daniel Marczak, Simone Magistri, Sebastian Cygert, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

Superpose Task-specific Features for Model Merging

Model merging enables powerful capabilities in neural networks without requiring additional training. In this paper, we introduce a novel perspective on model merging by leveraging the fundamental mechanisms of neural network…

Machine Learning · Computer Science 2025-09-19 Haiquan Qiu, You Wu, Dong Li, Jianmin Guo, Quanming Yao

ATM: Improving Model Merging by Alternating Tuning and Merging

Model merging has emerged as a cost-efficient approximation to multitask learning. Among merging strategies, task arithmetic is notable for its simplicity and effectiveness. In this work, we provide a theoretical motivation for task vectors…

Machine Learning · Computer Science 2025-08-11 Luca Zhou, Daniele Solombrino, Donato Crisostomi, Maria Sofia Bucarelli, Fabrizio Silvestri, Emanuele Rodolà

Model Merging in the Essential Subspace

Model merging aims to integrate multiple task-specific fine-tuned models derived from a shared pre-trained checkpoint into a single multi-task model without additional training. Despite extensive research, task interference remains a major…

Machine Learning · Computer Science 2026-02-25 Longhua Li, Lei Qi, Qi Tian, Xin Geng

From Task-Specific Models to Unified Systems: A Review of Model Merging Approaches

Model merging has achieved significant success, with numerous innovative methods proposed to enhance capabilities by combining multiple models. However, challenges persist due to the lack of a unified framework for classification and…

Machine Learning · Computer Science 2025-03-13 Wei Ruan, Tianze Yang, Yifan Zhou, Tianming Liu, Jin Lu

HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models

Model merging is a technique that combines multiple large pretrained models into a single model with enhanced performance and broader task adaptability. It has gained popularity in large pretrained model development due to its ability to…

Machine Learning · Computer Science 2024-09-30 Yu Zhou, Xingyu Wu, Jibin Wu, Liang Feng, Kay Chen Tan

Parameter Competition Balancing for Model Merging

While fine-tuning pretrained models has become common practice, these models often underperform outside their specific domains. Recently developed model merging techniques enable the direct integration of multiple models, each fine-tuned…

Computer Vision and Pattern Recognition · Computer Science 2024-10-04 Guodong Du, Junlin Lee, Jing Li, Runhua Jiang, Yifei Guo, Shuyang Yu, Hanting Liu, Sim Kuan Goh, Ho-Kin Tang, Daojing He, Min Zhang

Bridging Domains through Subspace-Aware Model Merging

Model merging integrates multiple task-specific models into a single consolidated one. Recent research has made progress in improving merging performance for in-distribution or multi-task scenarios, but domain generalization in model…

Machine Learning · Computer Science 2026-03-10 Levy Chaves, Chao Zhou, Rebekka Burkholz, Eduardo Valle, Sandra Avila

The Non-Local Model Merging Problem: Permutation Symmetries and Variance Collapse

Model merging aims to efficiently combine the weights of multiple expert models, each trained on a specific task, into a single multi-task model, with strong performance across all tasks. When applied to all but the last layer of weights,…

Machine Learning · Computer Science 2024-10-17 Ekansh Sharma, Daniel M. Roy, Gintare Karolina Dziugaite

Model Merging in the Era of Large Language Models: Methods, Applications, and Future Directions

Model merging combines the parameters of multiple neural networks into a single model without additional training. As fine-tuned large language models (LLMs) proliferate, merging offers a computationally efficient alternative to ensembles…

Computation and Language · Computer Science 2026-03-31 Mingyang Song, Mao Zheng

An Empirical Study and Theoretical Explanation on Task-Level Model-Merging Collapse

Model merging unifies independently fine-tuned LLMs from the same base, enabling reuse and integration of parallel development efforts without retraining. However, in practice we observe that merging does not always succeed: certain…

Artificial Intelligence · Computer Science 2026-03-11 Yuan Cao, Dezhi Ran, Yuzhe Guo, Mengzhou Wu, Simin Chen, Linyi Li, Wei Yang, Tao Xie

Non-Uniform Parameter-Wise Model Merging

Combining multiple machine learning models has long been a technique for enhancing performance, particularly in distributed settings. Traditional approaches, such as model ensembles, work well, but are expensive in terms of memory and…

Machine Learning · Computer Science 2024-12-23 Albert Manuel Orozco Camacho, Stefan Horoi, Guy Wolf, Eugene Belilovsky

MASS: MoErging through Adaptive Subspace Selection

Model merging has recently emerged as a lightweight alternative to ensembling, combining multiple fine-tuned models into a single set of parameters with no additional training overhead. Yet, existing merging methods fall short of matching…

Machine Learning · Computer Science 2026-03-18 Donato Crisostomi, Alessandro Zirilli, Antonio Andrea Gargiulo, Maria Sofia Bucarelli, Simone Scardapane, Fabrizio Silvestri, Iacopo Masi, Emanuele Rodolà

Localizing Task Information for Improved Model Merging and Compression

Model merging and task arithmetic have emerged as promising scalable approaches to merge multiple single-task checkpoints to one multi-task model, but their applicability is reduced by significant performance loss. Previous works have…

Machine Learning · Computer Science 2024-05-14 Ke Wang, Nikolaos Dimitriadis, Guillermo Ortiz-Jimenez, François Fleuret, Pascal Frossard

CAT Merging: A Training-Free Approach for Resolving Conflicts in Model Merging

Multi-task model merging offers a promising paradigm for integrating multiple expert models into a unified model without additional training. Existing state-of-the-art techniques, such as Task Arithmetic and its variants, merge models by…

Artificial Intelligence · Computer Science 2025-05-15 Wenju Sun, Qingyong Li, Yangli-ao Geng, Boyang Li

Why Do More Experts Fail? A Theoretical Analysis of Model Merging

Model merging dramatically reduces storage and computational resources by combining multiple expert models into a single multi-task model. Although recent model merging methods have shown promising results, they struggle to maintain…

Machine Learning · Computer Science 2025-06-04 Zijing Wang, Xingle Xu, Yongkang Liu, Yiqun Zhang, Peiqin Lin, Shi Feng, Xiaocui Yang, Daling Wang, Hinrich Schütze

Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs

Task arithmetic is a straightforward yet highly effective strategy for model merging, enabling the resultant model to exhibit multi-task capabilities. Recent research indicates that models demonstrating linearity enhance the performance of…

Machine Learning · Computer Science 2025-04-16 Rui Dai, Sile Hu, Xu Shen, Yonggang Zhang, Xinmei Tian, Jieping Ye

Parameter-Efficient Interventions for Enhanced Model Merging

Model merging combines knowledge from task-specific models into a unified multi-task model to avoid joint training on all task data. However, current methods face challenges due to representation bias, which can interfere with tasks…

Computer Vision and Pattern Recognition · Computer Science 2024-12-24 Marcin Osial, Daniel Marczak, Bartosz Zieliński