Related papers: Sequential Reptile: Inter-Task Gradient Alignment …

200 papers

Multilingual generative models obtain remarkable cross-lingual in-context learning capabilities through pre-training on large-scale corpora. However, they still exhibit a performance bias toward high-resource languages and learn isolated…

Computation and Language · Computer Science · 2024-06-13 · Chong Li, Shaonan Wang, Jiajun Zhang, Chengqing Zong

Multitask learning is a methodology to boost generalization performance and also reduce computational intensity and memory usage. However, learning multiple tasks simultaneously can be more difficult than learning a single task because it…

Machine Learning · Computer Science · 2020-06-03 · Sungjae Lee, Youngdoo Son

This paper presents a novel optimization method for maximizing generalization over tasks in meta-learning. The goal of meta-learning is to learn a model for an agent adapting rapidly when presented with previously unseen tasks. Tasks are…

Machine Learning · Computer Science · 2018-10-19 · Amir Erfan Eshratifar, David Eigen, Massoud Pedram

While large language models demonstrate remarkable capabilities at task-specific applications through fine-tuning, extending these benefits across diverse languages is essential for broad accessibility. However, effective cross-lingual…

Computation and Language · Computer Science · 2025-06-03 · Danni Liu, Jan Niehues

Multi-task post-training of large language models (LLMs) is typically performed by mixing datasets from different tasks and optimizing them jointly. This approach implicitly assumes that all tasks contribute gradients of similar magnitudes;…

Deep neural networks achieve state-of-the-art and sometimes super-human performance across various domains. However, when learning tasks sequentially, the networks easily forget the knowledge of previous tasks, known as "catastrophic…

Computer Vision and Pattern Recognition · Computer Science · 2021-05-18 · Shixiang Tang, Dapeng Chen, Jinguo Zhu, Shijie Yu, Wanli Ouyang

Multimodal continual instruction tuning enables multimodal large language models to sequentially adapt to new tasks while building upon previously acquired knowledge. However, this continual learning paradigm faces the significant challenge…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-23 · Songze Li, Mingyu Gao, Tonghua Su, Xu-Yao Zhang, Zhongjie Wang

In industrial recommendation systems, multi-task learning (learning multiple tasks simultaneously on a single model) is a predominant approach to save training/serving resources and improve recommendation performance via knowledge transfer…

Information Retrieval · Computer Science · 2024-11-20 · Yun He, Xuxing Chen, Jiayi Xu, Renqin Cai, Yiling You, Jennifer Cao, Minhui Huang, Liu Yang, Yiqun Liu, Xiaoyi Liu, Rong Jin, Sem Park, Bo Long, Xue Feng

Multi-Task Learning (MTL) aims to enhance the model generalization by sharing representations between related tasks for better performance. Typical MTL methods are jointly trained with the complete multitude of ground-truths for all tasks…

Computer Vision and Pattern Recognition · Computer Science · 2021-10-15 · Yufeng Wang, Yi-Hsuan Tsai, Wei-Chih Hung, Wenrui Ding, Shuo Liu, Ming-Hsuan Yang

Recently, fine-tuning pre-trained language models (e.g., multilingual BERT) to downstream cross-lingual tasks has shown promising results. However, the fine-tuning process inevitably changes the parameters of the pre-trained model and…

Computation and Language · Computer Science · 2020-10-06 · Zihan Liu, Genta Indra Winata, Andrea Madotto, Pascale Fung

Multi-task learning (MTL) has been widely applied in online advertising and recommender systems. To address the negative transfer issue, recent studies have proposed optimization methods that thoroughly focus on the gradient alignment of…

Information Retrieval · Computer Science · 2023-03-13 · Xuanhua Yang, Jianxin Zhao, Shaoguo Liu, Liang Wang, Bo Zheng

Multi-Task Learning is a learning paradigm that uses correlated tasks to improve performance generalization. A common way to learn multiple tasks is through the hard parameter sharing approach, in which a single architecture is used to…

Machine Learning · Computer Science · 2022-04-15 · Angelica Tiemi Mizuno Nakamura, Denis Fernando Wolf, Valdir Grassi

Multilingual BERT (mBERT) has shown reasonable capability for zero-shot cross-lingual transfer when fine-tuned on downstream tasks. Since mBERT is not pre-trained with explicit cross-lingual supervision, transfer performance can further be…

Computation and Language · Computer Science · 2020-10-01 · Saurabh Kulshreshtha, José Luis Redondo-García, Ching-Yun Chang

Fine-tuning pre-trained generative language models to downstream language generation tasks has shown promising results. However, this comes with the cost of having a single, large model for each task, which is not ideal in low-memory/power…

Computation and Language · Computer Science · 2020-09-22 · Zhaojiang Lin, Andrea Madotto, Pascale Fung

Many real-world machine learning applications involve several learning tasks which are inter-related. For example, in the healthcare domain, we need to learn a predictive model of a certain disease for many hospitals. The models for each…

Machine Learning · Computer Science · 2016-10-03 · Inci M. Baytas, Ming Yan, Anil K. Jain, Jiayu Zhou

Without any explicit cross-lingual training data, multilingual language models can achieve cross-lingual transfer. One common way to improve this transfer is to perform realignment steps before fine-tuning, i.e., to train the model to build…

Computation and Language · Computer Science · 2023-06-06 · Félix Gaschi, Patricio Cerda, Parisa Rastin, Yannick Toussaint

Meta-learning stands for 'learning to learn' such that generalization to new tasks is achieved. Among these methods, gradient-based meta-learning algorithms are a specific sub-class that excels at quick adaptation to new tasks with limited…

Machine Learning · Computer Science · 2020-10-20 · Jathushan Rajasegaran, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Mubarak Shah

Multi-task learning (MTL) aims to improve the generalization of several related tasks by learning them jointly. As a comparison, in addition to the joint training scheme, modern meta-learning allows unseen tasks with limited labels during…

Machine Learning · Computer Science · 2021-06-17 · Haoxiang Wang, Han Zhao, Bo Li

Large language models demonstrate reasonable multilingual abilities, despite predominantly English-centric pretraining. However, the spontaneous multilingual alignment in these models is shown to be weak, leading to unsatisfactory…

Computation and Language · Computer Science · 2024-11-19 · Jiahuan Li, Shujian Huang, Aarron Ching, Xinyu Dai, Jiajun Chen

Training large language models (LLMs) typically involves pre-training on massive corpora, only to restart the process entirely when new data becomes available. A more efficient and resource-conserving approach would be continual…
