Related papers: AdaMerging: Adaptive Model Merging for Multi-Task …

AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations

Multi-task learning (MTL) aims to enhance the performance and efficiency of machine learning models by simultaneously training them on multiple tasks. However, MTL research faces two challenges: 1) effectively modeling the relationships…

Information Retrieval · Computer Science · 2023-06-06 · Danwei Li, Zhengyu Zhang, Siyang Yuan, Mingze Gao, Weilin Zhang, Chaofei Yang, Xi Liu, Jiyan Yang

Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

Merging multiple expert models offers a promising approach for performing multi-task learning without accessing their original data. Existing methods attempt to alleviate task conflicts by sparsifying task vectors or promoting orthogonality…

Machine Learning · Computer Science · 2025-05-27 · Yongxian Wei, Anke Tang, Li Shen, Zixuan Hu, Chun Yuan, Xiaochun Cao

AdaMTL: Adaptive Input-dependent Inference for Efficient Multi-Task Learning

Modern augmented reality applications require performing multiple tasks on each input frame simultaneously. Multi-task learning (MTL) represents an effective approach where multiple tasks share an encoder to extract representative features…

Computer Vision and Pattern Recognition · Computer Science · 2023-04-19 · Marina Neseem, Ahmed Agiza, Sherief Reda

DivMerge: A divergence-based model merging method for multi-tasking

Multi-task learning (MTL) is often achieved by merging datasets before fine-tuning, but the growing availability of fine-tuned models has led to new approaches such as model merging via task arithmetic. A major challenge in this setting is…

Machine Learning · Computer Science · 2025-09-15 · Brahim Touayouch, Loïc Fosse, Géraldine Damnati, Gwénolé Lecorvé

ATM: Improving Model Merging by Alternating Tuning and Merging

Model merging has emerged as a cost-efficient approximation to multitask learning. Among merging strategies, task arithmetic is notable for its simplicity and effectiveness. In this work, we provide a theoretical motivation for task vectors…

Machine Learning · Computer Science · 2025-08-11 · Luca Zhou, Daniele Solombrino, Donato Crisostomi, Maria Sofia Bucarelli, Fabrizio Silvestri, Emanuele Rodolà

Merging Smarter, Generalizing Better: Enhancing Model Merging on OOD Data

Multi-task learning (MTL) concurrently trains a model on diverse task datasets to exploit common features, thereby improving overall performance across the tasks. Recent studies have dedicated efforts to merging multiple independent model…

Machine Learning · Computer Science · 2025-06-16 · Bingjie Zhang, Hongkang Li, Changlong Shi, Guowei Rong, He Zhao, Dongsheng Wang, Dandan Guo, Meng Wang

AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning

Multi-task learning (MTL) models have demonstrated impressive results in computer vision, natural language processing, and recommender systems. Even though many approaches have been proposed, how well these approaches balance different…

Machine Learning · Computer Science · 2024-05-06 · Enneng Yang, Junwei Pan, Ximei Wang, Haibin Yu, Li Shen, Xihua Chen, Lei Xiao, Jie Jiang, Guibing Guo

Multi-Task Model Merging via Adaptive Weight Disentanglement

Model merging has recently gained attention as an economical and scalable approach to incorporate task-specific weights from various tasks into a unified multi-task model. For example, in Task Arithmetic (TA), adding the fine-tuned weights…

Machine Learning · Computer Science · 2025-01-10 · Feng Xiong, Runxi Cheng, Wang Chen, Zhanqiu Zhang, Yiwen Guo, Chun Yuan, Ruifeng Xu

A Brief Review of Deep Multi-task Learning and Auxiliary Task Learning

Multi-task learning (MTL) optimizes several learning tasks simultaneously and leverages their shared information to improve generalization and the prediction of the model for each task. Auxiliary tasks can be added to the main task to…

Machine Learning · Computer Science · 2020-07-03 · Partoo Vafaeikia, Khashayar Namdar, Farzad Khalvati

MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic

The advent of large language models (LLMs) like GPT-4 has catalyzed the exploration of multi-task learning (MTL), in which a single model demonstrates proficiency across diverse tasks. Task arithmetic has emerged as a cost-effective…

Computation and Language · Computer Science · 2024-06-28 · Yuyan Zhou, Liang Song, Bingning Wang, Weipeng Chen

Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging

In the era of large language models, model merging is a promising way to combine multiple task-specific models into a single multitask model without extra training. However, two challenges remain: (a) interference between different models…

Computation and Language · Computer Science · 2024-10-15 · Zhenyi Lu, Chenghao Fan, Wei Wei, Xiaoye Qu, Dangyang Chen, Yu Cheng

AdaRank: Adaptive Rank Pruning for Enhanced Model Merging

Model merging has emerged as a promising approach for unifying independently fine-tuned models into an integrated framework, significantly enhancing computational efficiency in multi-task learning. Recently, several SVD-based techniques…

Machine Learning · Computer Science · 2026-03-03 · Chanhyuk Lee, Jiho Choi, Chanryeol Lee, Donggyun Kim, Seunghoon Hong

An Empirical Study of Multimodal Model Merging

Model merging (e.g., via interpolation or task arithmetic) fuses multiple models trained on different tasks to generate a multi-task solution. The technique has been proven successful in previous studies, where the models are trained on…

Computer Vision and Pattern Recognition · Computer Science · 2023-10-12 · Yi-Lin Sung, Linjie Li, Kevin Lin, Zhe Gan, Mohit Bansal, Lijuan Wang

MergeRepair: An Exploratory Study on Merging Task-Specific Adapters in Code LLMs for Automated Program Repair

Large Language Models (LLMs) have shown high capabilities in several software development-related tasks such as program repair, documentation, code refactoring, debugging, and testing. However, training these models requires massive amount…

Software Engineering · Computer Science · 2025-06-10 · Meghdad Dehghan, Jie JW Wu, Fatemeh H. Fard, Ali Ouni

SE-Merging: A Self-Enhanced Approach for Dynamic Model Merging

Model merging has gained increasing attention due to its intriguing property: interpolating the parameters of different task-specific fine-tuned models leads to multi-task abilities. However, despite its empirical success, the underlying…

Artificial Intelligence · Computer Science · 2025-06-24 · Zijun Chen, Zhanpeng Zhou, Bo Zhang, Weinan Zhang, Xi Sun, Junchi Yan

Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation

Multi-task learning (MTL) aims to improve the generalization of several related tasks by learning them jointly. As a comparison, in addition to the joint training scheme, modern meta-learning allows unseen tasks with limited labels during…

Machine Learning · Computer Science · 2021-06-17 · Haoxiang Wang, Han Zhao, Bo Li

Auxiliary Learning for Deep Multi-task Learning

Multi-task learning (MTL) is an efficient solution to solve multiple tasks simultaneously in order to get better speed and performance than handling each single task in turn. Most current methods can be categorized as either: (i) hard…

Computer Vision and Pattern Recognition · Computer Science · 2019-12-02 · Yifan Liu, Bohan Zhuang, Chunhua Shen, Hao Chen, Wei Yin

Efficient Multi-Task Modeling through Automated Fusion of Trained Models

Although multi-task learning is widely applied in intelligent services, traditional multi-task modeling methods often require customized designs based on specific task combinations, resulting in a cumbersome modeling process. Inspired by…

Machine Learning · Computer Science · 2025-04-15 · Jingxuan Zhou, Weidong Bao, Ji Wang, Zhengyi Zhong, Dayu Zhang

AIMMerging: Adaptive Iterative Model Merging Using Training Trajectories for Language Model Continual Learning

Continual learning (CL) is essential for deploying large language models (LLMs) in dynamic real-world environments without the need for costly retraining. Recent model merging-based methods have attracted significant attention, but they…

Computation and Language · Computer Science · 2025-09-23 · Yujie Feng, Jian Li, Xiaoyu Dong, Pengfei Xu, Xiaohui Zhou, Yujia Zhang, Zexin LU, Yasha Wang, Alan Zhao, Xu Chu, Xiao-Ming Wu

OptMerge: Unifying Multimodal LLM Capabilities and Modalities via Model Merging

Foundation models update slowly due to resource-intensive training, whereas domain-specific models evolve rapidly between releases. Model merging seeks to combine multiple expert models into a single, more capable model, reducing storage…

Artificial Intelligence · Computer Science · 2026-03-04 · Yongxian Wei, Runxi Cheng, Weike Jin, Enneng Yang, Li Shen, Lu Hou, Sinan Du, Chun Yuan, Xiaochun Cao, Dacheng Tao