Related papers: Information-Theoretic Requirements for Gradient-Ba…

Imbalanced Gradients in RL Post-Training of Multi-Task LLMs

Multi-task post-training of large language models (LLMs) is typically performed by mixing datasets from different tasks and optimizing them jointly. This approach implicitly assumes that all tasks contribute gradients of similar magnitudes;…

Machine Learning · Computer Science 2025-10-28 Runzhe Wu, Ankur Samanta, Ayush Jain, Scott Fujimoto, Jeongyeol Kwon, Ben Kretzu, Youliang Yu, Kaveh Hassani, Boris Vidolov, Yonathan Efroni

Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation

Multi-task learning (MTL) aims to improve the generalization of several related tasks by learning them jointly. As a comparison, in addition to the joint training scheme, modern meta-learning allows unseen tasks with limited labels during…

Machine Learning · Computer Science 2021-06-17 Haoxiang Wang, Han Zhao, Bo Li

"It's a Match!" -- A Benchmark of Task Affinity Scores for Joint Learning

While the promises of Multi-Task Learning (MTL) are attractive, characterizing the conditions of its success is still an open problem in Deep Learning. Some tasks may benefit from being learned together while others may be detrimental to…

Machine Learning · Computer Science 2023-01-10 Raphael Azorin, Massimo Gallo, Alessandro Finamore, Dario Rossi, Pietro Michiardi

Revisit the Imbalance Optimization in Multi-task Learning: An Experimental Analysis

Multi-task learning (MTL) aims to build general-purpose vision systems by training a single network to perform multiple tasks jointly. While promising, its potential is often hindered by "unbalanced optimization", where task interference…

Computer Vision and Pattern Recognition · Computer Science 2025-09-30 Yihang Guo, Tianyuan Yu, Liang Bai, Yanming Guo, Yirun Ruan, William Li, Weishi Zheng

Scalable Multitask Learning Using Gradient-based Estimation of Task Affinity

Multitask learning is a widely used paradigm for training models on diverse tasks, with applications ranging from graph neural networks to language model fine-tuning. Since tasks may interfere with each other, a key notion for modeling…

Machine Learning · Computer Science 2024-11-22 Dongyue Li, Aneesh Sharma, Hongyang R. Zhang

Multitask Learning with Single Gradient Step Update for Task Balancing

Multitask learning is a methodology to boost generalization performance and also reduce computational intensity and memory usage. However, learning multiple tasks simultaneously can be more difficult than learning a single task because it…

Machine Learning · Computer Science 2020-06-03 Sungjae Lee, Youngdoo Son

Asynchronous Multi-Task Learning

Many real-world machine learning applications involve several learning tasks which are inter-related. For example, in healthcare domain, we need to learn a predictive model of a certain disease for many hospitals. The models for each…

Machine Learning · Computer Science 2016-10-03 Inci M. Baytas, Ming Yan, Anil K. Jain, Jiayu Zhou

Distribution Matching for Heterogeneous Multi-Task Learning: a Large-scale Face Study

Multi-Task Learning has emerged as a methodology in which multiple tasks are jointly learned by a shared learning algorithm, such as a DNN. MTL is based on the assumption that the tasks under consideration are related; therefore it exploits…

Computer Vision and Pattern Recognition · Computer Science 2021-05-11 Dimitrios Kollias, Viktoriia Sharmanska, Stefanos Zafeiriou

Accelerating Meta-Learning by Sharing Gradients

The success of gradient-based meta-learning is primarily attributed to its ability to leverage related tasks to learn task-invariant information. However, the absence of interactions between different tasks in the inner loop leads to…

Machine Learning · Computer Science 2023-12-15 Oscar Chang, Hod Lipson

Learning Multi-Tasks with Inconsistent Labels by using Auxiliary Big Task

Multi-task learning is to improve the performance of the model by transferring and exploiting common knowledge among tasks. Existing MTL works mainly focus on the scenario where label sets among multiple tasks (MTs) are usually the same,…

Machine Learning · Computer Science 2022-01-10 Quan Feng, Songcan Chen

GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks

Deep multitask networks, in which one neural network produces multiple predictive outputs, can offer better speed and performance than their single-task counterparts but are challenging to train properly. We present a gradient normalization…

Computer Vision and Pattern Recognition · Computer Science 2018-07-16 Zhao Chen, Vijay Badrinarayanan, Chen-Yu Lee, Andrew Rabinovich

Leveraging convergence behavior to balance conflicting tasks in multi-task learning

Multi-Task Learning is a learning paradigm that uses correlated tasks to improve performance generalization. A common way to learn multiple tasks is through the hard parameter sharing approach, in which a single architecture is used to…

Machine Learning · Computer Science 2022-04-15 Angelica Tiemi Mizuno Nakamura, Denis Fernando Wolf, Valdir Grassi

A Multi-Task Learning Approach to Linear Multivariate Forecasting

Accurate forecasting of multivariate time series data is important in many engineering and scientific applications. Recent state-of-the-art works ignore the inter-relations between variates, using their model on each variate independently.…

Machine Learning · Computer Science 2025-03-18 Liran Nochumsohn, Hedi Zisling, Omri Azencot

Regularizing Deep Multi-Task Networks using Orthogonal Gradients

Deep neural networks are a promising approach towards multi-task learning because of their capability to leverage knowledge across domains and learn general purpose representations. Nevertheless, they can fail to live up to these promises…

Machine Learning · Computer Science 2019-12-17 Mihai Suteu, Yike Guo

Ensemble Prediction of Task Affinity for Efficient Multi-Task Learning

A fundamental problem in multi-task learning (MTL) is identifying groups of tasks that should be learned together. Since training MTL models for all possible combinations of tasks is prohibitively expensive for large task sets, a crucial…

Machine Learning · Computer Science 2026-02-24 Afiya Ayman, Ayan Mukhopadhyay, Aron Laszka

LDC-MTL: Balancing Multi-Task Learning through Scalable Loss Discrepancy Control

Multi-task learning (MTL) has been widely adopted for its ability to simultaneously learn multiple tasks. While existing gradient manipulation methods often yield more balanced solutions than simple scalarization-based approaches, they…

Machine Learning · Computer Science 2025-09-29 Peiyao Xiao, Chaosheng Dong, Shaofeng Zou, Kaiyi Ji

Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace

Gradient-based meta-learning methods leverage gradient descent to learn the commonalities among various tasks. While previous such methods have been successful in meta-learning tasks, they resort to simple gradient descent during…

Machine Learning · Statistics 2018-06-15 Yoonho Lee, Seungjin Choi

Learning Task Grouping and Overlap in Multi-task Learning

In the paradigm of multi-task learning, multiple related prediction tasks are learned jointly, sharing information across the tasks. We propose a framework for multi-task learning that enables one to selectively share the information…

Machine Learning · Computer Science 2012-07-03 Abhishek Kumar, Hal Daume

Cross-Task Consistency Learning Framework for Multi-Task Learning

Multi-task learning (MTL) is an active field in deep learning in which we train a model to jointly learn multiple tasks by exploiting relationships between the tasks. It has been shown that MTL helps the model share the learned features…

Computer Vision and Pattern Recognition · Computer Science 2021-11-30 Akihiro Nakano, Shi Chen, Kazuyuki Demachi

Examining Common Paradigms in Multi-Task Learning

While multi-task learning (MTL) has gained significant attention in recent years, its underlying mechanisms remain poorly understood. Recent methods did not yield consistent performance improvements over single task learning (STL)…

Machine Learning · Computer Science 2024-08-16 Cathrin Elich, Lukas Kirchdorfer, Jan M. Köhler, Lukas Schott