Related papers: TIES-Merging: Resolving Interference When Merging …

200 papers

In the era of large language models, model merging is a promising way to combine multiple task-specific models into a single multitask model without extra training. However, two challenges remain: (a) interference between different models…

Computation and Language · Computer Science 2024-10-15 Zhenyi Lu, Chenghao Fan, Wei Wei, Xiaoye Qu, Dangyang Chen, Yu Cheng

Large-scale deep learning models with a pretraining-finetuning paradigm have led to a surge of numerous task-specific models fine-tuned from a common pre-trained model. Recently, several research efforts have been made on merging these…

Machine Learning · Computer Science 2025-04-22 Yeoreum Lee, Jinwook Jung, Sungyong Baik

Model merging combines knowledge from task-specific models into a unified multi-task model to avoid joint training on all task data. However, current methods face challenges due to representation bias, which can interfere with tasks…

Computer Vision and Pattern Recognition · Computer Science 2024-12-24 Marcin Osial, Daniel Marczak, Bartosz Zieliński

Model merging has shown that multitask models can be created by directly combining the parameters of different models that are each specialized on tasks of interest. However, models trained independently on distinct tasks often exhibit…

Machine Learning · Computer Science 2026-03-17 Pratik Ramesh, George Stoica, Arun Iyer, Leshem Choshen, Judy Hoffman

Model merging aims to integrate multiple task-specific fine-tuned models derived from a shared pre-trained checkpoint into a single multi-task model without additional training. Despite extensive research, task interference remains a major…

Machine Learning · Computer Science 2026-02-25 Longhua Li, Lei Qi, Qi Tian, Xin Geng

Recent advances in large language models have led to numerous task-specialized fine-tuned variants, creating a need for efficient model merging techniques that preserve specialized capabilities while avoiding costly retraining. While…

Computation and Language · Computer Science 2025-02-20 Shuqi Liu, Han Wu, Bowei He, Xiongwei Han, Mingxuan Yuan, Linqi Song

While fine-tuning pretrained models has become common practice, these models often underperform outside their specific domains. Recently developed model merging techniques enable the direct integration of multiple models, each fine-tuned…

Computer Vision and Pattern Recognition · Computer Science 2024-10-04 Guodong Du, Junlin Lee, Jing Li, Runhua Jiang, Yifei Guo, Shuyang Yu, Hanting Liu, Sim Kuan Goh, Ho-Kin Tang, Daojing He, Min Zhang

Model merging has gained increasing attention due to its intriguing property: interpolating the parameters of different task-specific fine-tuned models leads to multi-task abilities. However, despite its empirical success, the underlying…

Artificial Intelligence · Computer Science 2025-06-24 Zijun Chen, Zhanpeng Zhou, Bo Zhang, Weinan Zhang, Xi Sun, Junchi Yan

The success of pretrain-finetune paradigm brings about the release of numerous model weights. In this case, merging models finetuned on different tasks to enable a single model with multi-task capabilities is gaining increasing attention…

Machine Learning · Computer Science 2024-09-30 Chenyu Huang, Peng Ye, Tao Chen, Tong He, Xiangyu Yue, Wanli Ouyang

Merging multiple expert models offers a promising approach for performing multi-task learning without accessing their original data. Existing methods attempt to alleviate task conflicts by sparsifying task vectors or promoting orthogonality…

Machine Learning · Computer Science 2025-05-27 Yongxian Wei, Anke Tang, Li Shen, Zixuan Hu, Chun Yuan, Xiaochun Cao

Model merging aims to combine multiple expert models into a more capable single model, offering benefits such as reduced storage and serving costs, improved generalization, and support for decentralized model development. Despite its…

Machine Learning · Computer Science 2024-10-07 Prateek Yadav, Tu Vu, Jonathan Lai, Alexandra Chronopoulou, Manaal Faruqui, Mohit Bansal, Tsendsuren Munkhdalai

Fine-tuning pre-trained models on targeted datasets enhances task-specific performance but often comes at the expense of generalization. Model merging techniques, which integrate multiple fine-tuned models into a single multi-task model…

Machine Learning · Computer Science 2025-09-11 Zitao Fang, Guodong DU, Shuyang Yu, Yifei Guo, Yiwei Zhang, Yiyao Cao, Jing Li, Ho-Kin Tang, Sim Kuan Goh

Transfer learning has become a powerful tool to initialize deep learning models to achieve faster convergence and higher performance. This is especially useful in the medical imaging analysis domain, where data scarcity limits possible…

Computer Vision and Pattern Recognition · Computer Science 2025-04-16 Ibrahim Almakky, Santosh Sanjeev, Anees Ur Rehman Hashmi, Mohammad Areeb Qazi, Hu Wang, Mohammad Yaqub

Multi-task learning (MTL) is often achieved by merging datasets before fine-tuning, but the growing availability of fine-tuned models has led to new approaches such as model merging via task arithmetic. A major challenge in this setting is…

Machine Learning · Computer Science 2025-09-15 Brahim Touayouch, Loïc Fosse, Géraldine Damnati, Gwénolé Lecorvé

Model merging combines knowledge from separately fine-tuned models, yet the factors driving its success remain poorly understood. While recent work treats mergeability as an intrinsic property of the models, we show with an…

Machine Learning · Computer Science 2026-05-27 Luca Zhou, Bo Zhao, Rose Yu, Emanuele Rodolà

Model merging has achieved significant success, with numerous innovative methods proposed to enhance capabilities by combining multiple models. However, challenges persist due to the lack of a unified framework for classification and…

Machine Learning · Computer Science 2025-03-13 Wei Ruan, Tianze Yang, Yifan Zhou, Tianming Liu, Jin Lu

Model merging constructs versatile models by integrating task-specific models without requiring labeled data or expensive joint retraining. Although recent methods improve adaptability to heterogeneous tasks by generating customized merged…

Machine Learning · Computer Science 2026-02-09 Haiyun Qiu, Xingyu Wu, Liang Feng, Kay Chen Tan

Multi-task model merging offers a promising paradigm for integrating multiple expert models into a unified model without additional training. Existing state-of-the-art techniques, such as Task Arithmetic and its variants, merge models by…

Artificial Intelligence · Computer Science 2025-05-15 Wenju Sun, Qingyong Li, Yangli-ao Geng, Boyang Li

Adapting general-purpose language models to new skills is currently an expensive process that must be repeated as new instruction datasets targeting new skills are created, or can cause the models to forget older skills. In this work, we…

Computation and Language · Computer Science 2024-10-18 Jacob Morrison, Noah A. Smith, Hannaneh Hajishirzi, Pang Wei Koh, Jesse Dodge, Pradeep Dasigi

Model merging offers an effective strategy to combine the strengths of multiple finetuned models into a unified model that preserves the specialized capabilities of each. Existing methods merge models in a global manner, performing…

Machine Learning · Computer Science 2025-01-08 Yifei He, Yuzheng Hu, Yong Lin, Tong Zhang, Han Zhao