Related papers: Parameter-Efficient Interventions for Enhanced Mod…

SyMerge: From Non-Interference to Synergistic Merging via Single-Layer Adaptation

Model merging combines independently trained models into a single multi-task model. However, most existing approaches focus primarily on avoiding task interference. We argue that its greater potential lies in enabling task synergy, where…

Machine Learning · Computer Science 2026-05-25 Aecheon Jung, Seunghwan Lee, Dongyoon Han, Sungeun Hong

Parameter Competition Balancing for Model Merging

While fine-tuning pretrained models has become common practice, these models often underperform outside their specific domains. Recently developed model merging techniques enable the direct integration of multiple models, each fine-tuned…

Computer Vision and Pattern Recognition · Computer Science 2024-10-04 Guodong Du, Junlin Lee, Jing Li, Runhua Jiang, Yifei Guo, Shuyang Yu, Hanting Liu, Sim Kuan Goh, Ho-Kin Tang, Daojing He, Min Zhang

Resolving Interference (RI): Disentangling Models for Improved Model Merging

Model merging has shown that multitask models can be created by directly combining the parameters of different models that are each specialized on tasks of interest. However, models trained independently on distinct tasks often exhibit…

Machine Learning · Computer Science 2026-03-17 Pratik Ramesh, George Stoica, Arun Iyer, Leshem Choshen, Judy Hoffman

Model Merging in the Essential Subspace

Model merging aims to integrate multiple task-specific fine-tuned models derived from a shared pre-trained checkpoint into a single multi-task model without additional training. Despite extensive research, task interference remains a major…

Machine Learning · Computer Science 2026-02-25 Longhua Li, Lei Qi, Qi Tian, Xin Geng

Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning

Large-scale deep learning models with a pretraining-finetuning paradigm have led to a surge of numerous task-specific models fine-tuned from a common pre-trained model. Recently, several research efforts have been made on merging these…

Machine Learning · Computer Science 2025-04-22 Yeoreum Lee, Jinwook Jung, Sungyong Baik

Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

Merging multiple expert models offers a promising approach for performing multi-task learning without accessing their original data. Existing methods attempt to alleviate task conflicts by sparsifying task vectors or promoting orthogonality…

Machine Learning · Computer Science 2025-05-27 Yongxian Wei, Anke Tang, Li Shen, Zixuan Hu, Chun Yuan, Xiaochun Cao

TIES-Merging: Resolving Interference When Merging Models

Transfer learning - i.e., further fine-tuning a pre-trained model on a downstream task - can confer significant advantages, including improved downstream performance, faster convergence, and better sample efficiency. These advantages have…

Machine Learning · Computer Science 2023-10-30 Prateek Yadav, Derek Tam, Leshem Choshen, Colin Raffel, Mohit Bansal

Non-Uniform Parameter-Wise Model Merging

Combining multiple machine learning models has long been a technique for enhancing performance, particularly in distributed settings. Traditional approaches, such as model ensembles, work well, but are expensive in terms of memory and…

Machine Learning · Computer Science 2024-12-23 Albert Manuel Orozco Camacho, Stefan Horoi, Guy Wolf, Eugene Belilovsky

Revisiting Weight Averaging for Model Merging

Model merging aims to build a multi-task learner by combining the parameters of individually fine-tuned models without additional training. While a straightforward approach is to average model parameters across tasks, this often results in…

Machine Learning · Computer Science 2025-04-04 Jiho Choi, Donggyun Kim, Chanhyuk Lee, Seunghoon Hong

Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging

In the era of large language models, model merging is a promising way to combine multiple task-specific models into a single multitask model without extra training. However, two challenges remain: (a) interference between different models…

Computation and Language · Computer Science 2024-10-15 Zhenyi Lu, Chenghao Fan, Wei Wei, Xiaoye Qu, Dangyang Chen, Yu Cheng

Merging Multi-Task Models via Weight-Ensembling Mixture of Experts

Merging various task-specific Transformer-based models trained on different tasks into a single unified model can execute all the tasks concurrently. Previous methods, exemplified by task arithmetic, have been proven to be both effective…

Machine Learning · Computer Science 2024-06-10 Anke Tang, Li Shen, Yong Luo, Nan Yin, Lefei Zhang, Dacheng Tao

Pareto Merging: Multi-Objective Optimization for Preference-Aware Model Merging

Model merging, which combines multiple models into a single model, has gained popularity in recent years. By efficiently integrating the capabilities of various models, this significantly reduces the parameter count and memory usage.…

Machine Learning · Computer Science 2025-02-11 Weiyu Chen, James Kwok

MedMerge: Merging Models for Effective Transfer Learning to Medical Imaging Tasks

Transfer learning has become a powerful tool to initialize deep learning models to achieve faster convergence and higher performance. This is especially useful in the medical imaging analysis domain, where data scarcity limits possible…

Computer Vision and Pattern Recognition · Computer Science 2025-04-16 Ibrahim Almakky, Santosh Sanjeev, Anees Ur Rehman Hashmi, Mohammad Areeb Qazi, Hu Wang, Mohammad Yaqub

To See a World in a Spark of Neuron: Disentangling Multi-task Interference for Training-free Model Merging

Fine-tuning pre-trained models on targeted datasets enhances task-specific performance but often comes at the expense of generalization. Model merging techniques, which integrate multiple fine-tuned models into a single multi-task model…

Machine Learning · Computer Science 2025-09-11 Zitao Fang, Guodong DU, Shuyang Yu, Yifei Guo, Yiwei Zhang, Yiyao Cao, Jing Li, Ho-Kin Tang, Sim Kuan Goh

Dynamic Fisher-weighted Model Merging via Bayesian Optimization

The fine-tuning of pre-trained language models has resulted in the widespread availability of task-specific models. Model merging offers an efficient way to create multi-task models by combining these fine-tuned models at the parameter…

Computation and Language · Computer Science 2025-04-29 Sanwoo Lee, Jiahao Liu, Qifan Wang, Jingang Wang, Xunliang Cai, Yunfang Wu

Merging by Matching Models in Task Parameter Subspaces

Model merging aims to cheaply combine individual task-specific models into a single multitask model. In this work, we view past merging methods as leveraging different notions of a "task parameter subspace" in which models are matched…

Machine Learning · Computer Science 2024-04-16 Derek Tam, Mohit Bansal, Colin Raffel

Model Merging via Data-Free Covariance Estimation

Model merging provides a way of cheaply combining individual models to produce a model that inherits each individual's capabilities. While some merging methods can approach the performance of multitask training, they are often heuristically…

Machine Learning · Computer Science 2026-04-03 Marawan Gamal Abdel Hameed, Derek Tam, Pascal Jr Tikeng Notsawo, Colin Raffel, Guillaume Rabusseau

Representation Surgery for Multi-Task Model Merging

Multi-task learning (MTL) compresses the information from multiple tasks into a unified backbone to improve computational efficiency and generalization. Recent work directly merges multiple independently trained models to perform MTL…

Machine Learning · Computer Science 2024-05-29 Enneng Yang, Li Shen, Zhenyi Wang, Guibing Guo, Xiaojun Chen, Xingwei Wang, Dacheng Tao

Fine-Grained Model Merging via Modular Expert Recombination

Model merging constructs versatile models by integrating task-specific models without requiring labeled data or expensive joint retraining. Although recent methods improve adaptability to heterogeneous tasks by generating customized merged…

Machine Learning · Computer Science 2026-02-09 Haiyun Qiu, Xingyu Wu, Liang Feng, Kay Chen Tan

Surrogate Benchmarks for Model Merging Optimization

Model merging techniques aim to integrate the abilities of multiple models into a single model. Most model merging techniques have hyperparameters, and their setting affects the performance of the merged model. Because several existing…

Machine Learning · Computer Science 2025-09-03 Rio Akizuki, Yuya Kudo, Nozomu Yoshinari, Yoichi Hirose, Toshiyuki Nishimoto, Kento Uchida, Shinichi Shirakawa