Related papers: Model Merging: Foundations and Algorithms

Training-free Heterogeneous Model Merging

Model merging has attracted significant attention as a powerful paradigm for model reuse, facilitating the integration of task-specific models into a singular, versatile framework endowed with multifarious capabilities. Previous studies,…

Machine Learning · Computer Science · 2025-01-03 · Zhengqi Xu, Han Zheng, Jie Song, Li Sun, Mingli Song

Model Merging in the Era of Large Language Models: Methods, Applications, and Future Directions

Model merging combines the parameters of multiple neural networks into a single model without additional training. As fine-tuned large language models (LLMs) proliferate, merging offers a computationally efficient alternative to ensembles…

Computation and Language · Computer Science · 2026-03-31 · Mingyang Song, Mao Zheng

HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models

Model merging is a technique that combines multiple large pretrained models into a single model with enhanced performance and broader task adaptability. It has gained popularity in large pretrained model development due to its ability to…

Machine Learning · Computer Science · 2024-09-30 · Yu Zhou, Xingyu Wu, Jibin Wu, Liang Feng, Kay Chen Tan

Deep Model Fusion: A Survey

Deep model fusion/merging is an emerging technique that merges the parameters or predictions of multiple deep learning models into a single one. It combines the abilities of different models to make up for the biases and errors of a single…

Machine Learning · Computer Science · 2023-09-28 · Weishi Li, Yong Peng, Miao Zhang, Liang Ding, Han Hu, Li Shen

Revitalizing the Beginning: Avoiding Storage Dependency for Model Merging in Continual Learning

Model merging provides a compelling paradigm for integrating specialized expertise into a unified multi-task model, a goal that aligns naturally with the sequential knowledge acquisition in continual learning (CL). However, the requirement…

Machine Learning · Computer Science · 2026-05-12 · Xi Wang, Cheng Deng

Towards Reversible Model Merging For Low-rank Weights

Model merging aims to combine multiple fine-tuned models into a single set of weights that performs well across all source tasks. While prior work has shown that merging can approximate the performance of individual fine-tuned models for…

Machine Learning · Computer Science · 2025-10-17 · Mohammadsajad Alipour, Mohammad Mohammadi Amiri

Revisiting Weight Averaging for Model Merging

Model merging aims to build a multi-task learner by combining the parameters of individually fine-tuned models without additional training. While a straightforward approach is to average model parameters across tasks, this often results in…

Machine Learning · Computer Science · 2025-04-04 · Jiho Choi, Donggyun Kim, Chanhyuk Lee, Seunghoon Hong

FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization

Model merging has emerged as a promising approach for multi-task learning (MTL), offering a data-efficient alternative to conventional fine-tuning. However, with the rapid development of the open-source AI ecosystem and the increasing…

Machine Learning · Computer Science · 2025-10-01 · Hao Mark Chen, Shell Xu Hu, Wayne Luk, Timothy Hospedales, Hongxiang Fan

Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging

Deep model merging represents an emerging research direction that combines multiple fine-tuned models to harness their specialized capabilities across different tasks and domains. Current model merging techniques focus on merging all…

Machine Learning · Computer Science · 2025-01-17 · Anke Tang, Enneng Yang, Li Shen, Yong Luo, Han Hu, Bo Du, Dacheng Tao

No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces

Model merging integrates the weights of multiple task-specific models into a single multi-task model. Despite recent interest in the problem, a significant performance gap between the combined and single-task models remains. In this paper,…

Machine Learning · Computer Science · 2025-06-12 · Daniel Marczak, Simone Magistri, Sebastian Cygert, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

Merging multiple expert models offers a promising approach for performing multi-task learning without accessing their original data. Existing methods attempt to alleviate task conflicts by sparsifying task vectors or promoting orthogonality…

Machine Learning · Computer Science · 2025-05-27 · Yongxian Wei, Anke Tang, Li Shen, Zixuan Hu, Chun Yuan, Xiaochun Cao

Model Merging by Output-Space Projection

Model merging combines fine-tuned checkpoints into a single multi-task model without retraining. Existing methods - such as task arithmetic, model soups, TIES, and DARE - are computationally efficient and empirically successful, but rely on…

Machine Learning · Computer Science · 2026-05-29 · Bethan Evans, Benjamin Etheridge, Stephen Roberts, Jared Tanner

$C^2M^3$: Cycle-Consistent Multi-Model Merging

In this paper, we present a novel data-free method for merging neural networks in weight space. Differently from most existing works, our method optimizes for the permutations of network neurons globally across all layers. This allows us to…

Machine Learning · Computer Science · 2024-10-31 · Donato Crisostomi, Marco Fumero, Daniele Baieri, Florian Bernard, Emanuele Rodolà

Superpose Task-specific Features for Model Merging

Model merging enables powerful capabilities in neural networks without requiring additional training. In this paper, we introduce a novel perspective on model merging by leveraging the fundamental mechanisms of neural network…

Machine Learning · Computer Science · 2025-09-19 · Haiquan Qiu, You Wu, Dong Li, Jianmin Guo, Quanming Yao

To See a World in a Spark of Neuron: Disentangling Multi-task Interference for Training-free Model Merging

Fine-tuning pre-trained models on targeted datasets enhances task-specific performance but often comes at the expense of generalization. Model merging techniques, which integrate multiple fine-tuned models into a single multi-task model…

Machine Learning · Computer Science · 2025-09-11 · Zitao Fang, Guodong DU, Shuyang Yu, Yifei Guo, Yiwei Zhang, Yiyao Cao, Jing Li, Ho-Kin Tang, Sim Kuan Goh

Bridging Training and Merging Through Momentum-Aware Optimization

Training large neural networks and merging task-specific models both exploit low-rank structure and require parameter importance estimation, yet these challenges have been pursued in isolation. Current workflows compute curvature…

Machine Learning · Computer Science · 2026-03-30 · Alireza Moayedikia, Alicia Troncoso

Model Merging in the Essential Subspace

Model merging aims to integrate multiple task-specific fine-tuned models derived from a shared pre-trained checkpoint into a single multi-task model without additional training. Despite extensive research, task interference remains a major…

Machine Learning · Computer Science · 2026-02-25 · Longhua Li, Lei Qi, Qi Tian, Xin Geng

GNNMerge: Merging of GNN Models Without Accessing Training Data

Model merging has gained prominence in machine learning as a method to integrate multiple trained models into a single model without accessing the original training data. While existing approaches have demonstrated success in domains such…

Machine Learning · Computer Science · 2025-03-28 · Vipul Garg, Ishita Thakre, Sayan Ranu

Non-Uniform Parameter-Wise Model Merging

Combining multiple machine learning models has long been a technique for enhancing performance, particularly in distributed settings. Traditional approaches, such as model ensembles, work well, but are expensive in terms of memory and…

Machine Learning · Computer Science · 2024-12-23 · Albert Manuel Orozco Camacho, Stefan Horoi, Guy Wolf, Eugene Belilovsky

Bayesian Model Merging

Model merging aims to combine multiple task-specific expert models into a single model without joint retraining, offering a practical alternative to multi-task learning when data access or computational budget is limited. Existing methods,…

Machine Learning · Computer Science · 2026-05-14 · Kaiyang Li, Shaobo Han, Qing Su, Shihao Ji