Related papers: Optimistic Global Function Merger

Towards Minimizing Feature Drift in Model Merging: Layer-wise Task Vector Fusion for Adaptive Knowledge Integration

Multi-task model merging aims to consolidate knowledge from multiple fine-tuned task-specific experts into a unified model while minimizing performance degradation. Existing methods primarily approach this by minimizing differences between…

Machine Learning · Computer Science 2025-10-28 Wenju Sun, Qingyong Li, Wen Wang, Yang Liu, Yangli-ao Geng, Boyang Li

Multi-task Code LLMs: Data Mix or Model Merge?

Recent research advocates deploying smaller, specialized code LLMs in agentic frameworks alongside frontier models, sparking interest in efficient strategies for multi-task learning that balance performance, constraints, and costs. We…

Computation and Language · Computer Science 2026-01-30 Mingzhi Zhu, Boris Sobolev, Rahul Krishna, Raju Pavuluri, Stacy Patterson, Michele Merler

Function-constrained Program Synthesis

This work introduces (1) a technique that allows large language models (LLMs) to leverage user-provided code when solving programming tasks and (2) a method to iteratively generate modular sub-functions that can aid future code generation…

Machine Learning · Computer Science 2023-12-05 Patrick Hajali, Ignas Budvytis

Navigating the Accuracy-Size Trade-Off with Flexible Model Merging

Model merging has emerged as an efficient method to combine multiple single-task fine-tuned models. The merged model can enjoy multi-task capabilities without expensive training. While promising, merging into a single model often suffers…

Computer Vision and Pattern Recognition · Computer Science 2026-04-15 Akash Dhasade, Divyansh Jhunjhunwala, Milos Vujasinovic, Gauri Joshi, Anne-Marie Kermarrec

Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging

In the era of large language models, model merging is a promising way to combine multiple task-specific models into a single multitask model without extra training. However, two challenges remain: (a) interference between different models…

Computation and Language · Computer Science 2024-10-15 Zhenyi Lu, Chenghao Fan, Wei Wei, Xiaoye Qu, Dangyang Chen, Yu Cheng

Model Merging by Output-Space Projection

Model merging combines fine-tuned checkpoints into a single multi-task model without retraining. Existing methods - such as task arithmetic, model soups, TIES, and DARE - are computationally efficient and empirically successful, but rely on…

Machine Learning · Computer Science 2026-05-29 Bethan Evans, Benjamin Etheridge, Stephen Roberts, Jared Tanner

Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic

Model merging offers an effective strategy to combine the strengths of multiple finetuned models into a unified model that preserves the specialized capabilities of each. Existing methods merge models in a global manner, performing…

Machine Learning · Computer Science 2025-01-08 Yifei He, Yuzheng Hu, Yong Lin, Tong Zhang, Han Zhao

FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization

Model merging has emerged as a promising approach for multi-task learning (MTL), offering a data-efficient alternative to conventional fine-tuning. However, with the rapid development of the open-source AI ecosystem and the increasing…

Machine Learning · Computer Science 2025-10-01 Hao Mark Chen, Shell Xu Hu, Wayne Luk, Timothy Hospedales, Hongxiang Fan

Fine, I'll Merge It Myself: A Multi-Fidelity Framework for Automated Model Merging

Reasoning capabilities represent a critical frontier for large language models (LLMs), but developing them requires extensive proprietary datasets and computational resources. One way to efficiently supplement capabilities is by model…

Artificial Intelligence · Computer Science 2025-06-26 Guinan Su, Jonas Geiping

Global optimization tailored for graphics processing units: Complete and rigorous search for large-scale nonlinear minimization

This paper introduces a numerical method to enclose the global minimum of a nonlinear function subject to simple bounds on the variables. Using interval analysis, coupled with the computational power and architecture of graphics processing…

Numerical Analysis · Mathematics 2026-04-15 Guanglu Zhang, Qihang Shan, Jonathan Cagan

Realistic Evaluation of Model Merging for Compositional Generalization

Merging has become a widespread way to cheaply combine individual models into a single model that inherits their capabilities and attains better performance. This popularity has spurred rapid development of many new merging methods, which…

Machine Learning · Computer Science 2024-09-30 Derek Tam, Yash Kant, Brian Lester, Igor Gilitschenski, Colin Raffel

Optimizing Function Layout for Mobile Applications

Function layout, also referred to as function reordering or function placement, is one of the most effective profile-guided compiler optimizations. By reordering functions in a binary, compilers are able to greatly improve the performance…

Programming Languages · Computer Science 2022-11-18 Ellis Hoag, Kyungwoo Lee, Julián Mestre, Sergey Pupyrev

Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

Merging multiple expert models offers a promising approach for performing multi-task learning without accessing their original data. Existing methods attempt to alleviate task conflicts by sparsifying task vectors or promoting orthogonality…

Machine Learning · Computer Science 2025-05-27 Yongxian Wei, Anke Tang, Li Shen, Zixuan Hu, Chun Yuan, Xiaochun Cao

Fusionize: Improving Serverless Application Performance through Feedback-Driven Function Fusion

Serverless computing increases developer productivity by removing operational concerns such as managing hardware or software runtimes. Developers, however, still need to partition their application into functions, which can be error-prone…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-16 Trever Schirmer, Joel Scheuner, Tobias Pfandzelter, David Bermbach

An Empirical Study of Multimodal Model Merging

Model merging (e.g., via interpolation or task arithmetic) fuses multiple models trained on different tasks to generate a multi-task solution. The technique has been proven successful in previous studies, where the models are trained on…

Computer Vision and Pattern Recognition · Computer Science 2023-10-12 Yi-Lin Sung, Linjie Li, Kevin Lin, Zhe Gan, Mohit Bansal, Lijuan Wang

Fusing Gathers with Integer Linear Programming

We present an Integer Linear Programming based approach to finding the optimal fusion strategy for combinator-based parallel programs. While combinator-based languages or libraries provide a convenient interface for programming parallel…

Programming Languages · Computer Science 2024-07-19 David van Balen, Gabriele Keller, Ivo Gabe de Wolff, Trevor L. McDonell

What Matters for Model Merging at Scale?

Model merging aims to combine multiple expert models into a more capable single model, offering benefits such as reduced storage and serving costs, improved generalization, and support for decentralized model development. Despite its…

Machine Learning · Computer Science 2024-10-07 Prateek Yadav, Tu Vu, Jonathan Lai, Alexandra Chronopoulou, Manaal Faruqui, Mohit Bansal, Tsendsuren Munkhdalai

OptMerge: Unifying Multimodal LLM Capabilities and Modalities via Model Merging

Foundation models update slowly due to resource-intensive training, whereas domain-specific models evolve rapidly between releases. Model merging seeks to combine multiple expert models into a single, more capable model, reducing storage…

Artificial Intelligence · Computer Science 2026-03-04 Yongxian Wei, Runxi Cheng, Weike Jin, Enneng Yang, Li Shen, Lu Hou, Sinan Du, Chun Yuan, Xiaochun Cao, Dacheng Tao

No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces

Model merging integrates the weights of multiple task-specific models into a single multi-task model. Despite recent interest in the problem, a significant performance gap between the combined and single-task models remains. In this paper,…

Machine Learning · Computer Science 2025-06-12 Daniel Marczak, Simone Magistri, Sebastian Cygert, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

Channel Merging: Preserving Specialization for Merged Experts

Lately, the practice of utilizing task-specific fine-tuning has been implemented to improve the performance of large language models (LLMs) in subsequent tasks. Through the integration of diverse LLMs, the overall competency of LLMs is…

Computation and Language · Computer Science 2024-12-23 Mingyang Zhang, Jing Liu, Ganggui Ding, Xinyi Yu, Linlin Ou, Bohan Zhuang