Related papers: TIES-Merging: Resolving Interference When Merging …

Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging

In the era of large language models, model merging is a promising way to combine multiple task-specific models into a single multitask model without extra training. However, two challenges remain: (a) interference between different models…

Computation and Language · Computer Science 2024-10-15 Zhenyi Lu, Chenghao Fan, Wei Wei, Xiaoye Qu, Dangyang Chen, Yu Cheng

Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning

Large-scale deep learning models with a pretraining-finetuning paradigm have led to a surge of numerous task-specific models fine-tuned from a common pre-trained model. Recently, several research efforts have been made on merging these…

Machine Learning · Computer Science 2025-04-22 Yeoreum Lee, Jinwook Jung, Sungyong Baik

Parameter-Efficient Interventions for Enhanced Model Merging

Model merging combines knowledge from task-specific models into a unified multi-task model to avoid joint training on all task data. However, current methods face challenges due to representation bias, which can interfere with tasks…

Computer Vision and Pattern Recognition · Computer Science 2024-12-24 Marcin Osial, Daniel Marczak, Bartosz Zieliński

Resolving Interference (RI): Disentangling Models for Improved Model Merging

Model merging has shown that multitask models can be created by directly combining the parameters of different models that are each specialized on tasks of interest. However, models trained independently on distinct tasks often exhibit…

Machine Learning · Computer Science 2026-03-17 Pratik Ramesh, George Stoica, Arun Iyer, Leshem Choshen, Judy Hoffman

Model Merging in the Essential Subspace

Model merging aims to integrate multiple task-specific fine-tuned models derived from a shared pre-trained checkpoint into a single multi-task model without additional training. Despite extensive research, task interference remains a major…

Machine Learning · Computer Science 2026-02-25 Longhua Li, Lei Qi, Qi Tian, Xin Geng

Sens-Merging: Sensitivity-Guided Parameter Balancing for Merging Large Language Models

Recent advances in large language models have led to numerous task-specialized fine-tuned variants, creating a need for efficient model merging techniques that preserve specialized capabilities while avoiding costly retraining. While…

Computation and Language · Computer Science 2025-02-20 Shuqi Liu, Han Wu, Bowei He, Xiongwei Han, Mingxuan Yuan, Linqi Song

Parameter Competition Balancing for Model Merging

While fine-tuning pretrained models has become common practice, these models often underperform outside their specific domains. Recently developed model merging techniques enable the direct integration of multiple models, each fine-tuned…

Computer Vision and Pattern Recognition · Computer Science 2024-10-04 Guodong Du, Junlin Lee, Jing Li, Runhua Jiang, Yifei Guo, Shuyang Yu, Hanting Liu, Sim Kuan Goh, Ho-Kin Tang, Daojing He, Min Zhang

SE-Merging: A Self-Enhanced Approach for Dynamic Model Merging

Model merging has gained increasing attention due to its intriguing property: interpolating the parameters of different task-specific fine-tuned models leads to multi-task abilities. However, despite its empirical success, the underlying…

Artificial Intelligence · Computer Science 2025-06-24 Zijun Chen, Zhanpeng Zhou, Bo Zhang, Weinan Zhang, Xi Sun, Junchi Yan

EMR-Merging: Tuning-Free High-Performance Model Merging

The success of the pretrain-finetune paradigm brings about the release of numerous model weights. In this case, merging models finetuned on different tasks to enable a single model with multi-task capabilities is gaining increasing attention…

Machine Learning · Computer Science 2024-09-30 Chenyu Huang, Peng Ye, Tao Chen, Tong He, Xiangyu Yue, Wanli Ouyang

Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

Merging multiple expert models offers a promising approach for performing multi-task learning without accessing their original data. Existing methods attempt to alleviate task conflicts by sparsifying task vectors or promoting orthogonality…

Machine Learning · Computer Science 2025-05-27 Yongxian Wei, Anke Tang, Li Shen, Zixuan Hu, Chun Yuan, Xiaochun Cao

What Matters for Model Merging at Scale?

Model merging aims to combine multiple expert models into a more capable single model, offering benefits such as reduced storage and serving costs, improved generalization, and support for decentralized model development. Despite its…

Machine Learning · Computer Science 2024-10-07 Prateek Yadav, Tu Vu, Jonathan Lai, Alexandra Chronopoulou, Manaal Faruqui, Mohit Bansal, Tsendsuren Munkhdalai

To See a World in a Spark of Neuron: Disentangling Multi-task Interference for Training-free Model Merging

Fine-tuning pre-trained models on targeted datasets enhances task-specific performance but often comes at the expense of generalization. Model merging techniques, which integrate multiple fine-tuned models into a single multi-task model…

Machine Learning · Computer Science 2025-09-11 Zitao Fang, Guodong DU, Shuyang Yu, Yifei Guo, Yiwei Zhang, Yiyao Cao, Jing Li, Ho-Kin Tang, Sim Kuan Goh

MedMerge: Merging Models for Effective Transfer Learning to Medical Imaging Tasks

Transfer learning has become a powerful tool to initialize deep learning models to achieve faster convergence and higher performance. This is especially useful in the medical imaging analysis domain, where data scarcity limits possible…

Computer Vision and Pattern Recognition · Computer Science 2025-04-16 Ibrahim Almakky, Santosh Sanjeev, Anees Ur Rehman Hashmi, Mohammad Areeb Qazi, Hu Wang, Mohammad Yaqub

DivMerge: A divergence-based model merging method for multi-tasking

Multi-task learning (MTL) is often achieved by merging datasets before fine-tuning, but the growing availability of fine-tuned models has led to new approaches such as model merging via task arithmetic. A major challenge in this setting is…

Machine Learning · Computer Science 2025-09-15 Brahim Touayouch, Loïc Fosse, Géraldine Damnati, Gwénolé Lecorvé

Demystifying Mergeability: Interpretable Properties to Predict Model Merging Success

Model merging combines knowledge from separately fine-tuned models, yet the factors driving its success remain poorly understood. While recent work treats mergeability as an intrinsic property of the models, we show with an…

Machine Learning · Computer Science 2026-05-27 Luca Zhou, Bo Zhao, Rose Yu, Emanuele Rodolà

From Task-Specific Models to Unified Systems: A Review of Model Merging Approaches

Model merging has achieved significant success, with numerous innovative methods proposed to enhance capabilities by combining multiple models. However, challenges persist due to the lack of a unified framework for classification and…

Machine Learning · Computer Science 2025-03-13 Wei Ruan, Tianze Yang, Yifan Zhou, Tianming Liu, Jin Lu

Fine-Grained Model Merging via Modular Expert Recombination

Model merging constructs versatile models by integrating task-specific models without requiring labeled data or expensive joint retraining. Although recent methods improve adaptability to heterogeneous tasks by generating customized merged…

Machine Learning · Computer Science 2026-02-09 Haiyun Qiu, Xingyu Wu, Liang Feng, Kay Chen Tan

CAT Merging: A Training-Free Approach for Resolving Conflicts in Model Merging

Multi-task model merging offers a promising paradigm for integrating multiple expert models into a unified model without additional training. Existing state-of-the-art techniques, such as Task Arithmetic and its variants, merge models by…

Artificial Intelligence · Computer Science 2025-05-15 Wenju Sun, Qingyong Li, Yangli-ao Geng, Boyang Li

Merge to Learn: Efficiently Adding Skills to Language Models with Model Merging

Adapting general-purpose language models to new skills is currently an expensive process that must be repeated as new instruction datasets targeting new skills are created, or can cause the models to forget older skills. In this work, we…

Computation and Language · Computer Science 2024-10-18 Jacob Morrison, Noah A. Smith, Hannaneh Hajishirzi, Pang Wei Koh, Jesse Dodge, Pradeep Dasigi

Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic

Model merging offers an effective strategy to combine the strengths of multiple finetuned models into a unified model that preserves the specialized capabilities of each. Existing methods merge models in a global manner, performing…

Machine Learning · Computer Science 2025-01-08 Yifei He, Yuzheng Hu, Yong Lin, Tong Zhang, Han Zhao