Related papers: Distillation based Multi-task Learning: A Candidat…

Knowledge Distillation for Multi-task Learning

Multi-task learning (MTL) aims to learn a single model that performs multiple tasks, achieving good performance on all tasks at lower computational cost. Learning such a model requires jointly optimizing the losses of a set of tasks with…

Computer Vision and Pattern Recognition · Computer Science 2020-09-25 Wei-Hong Li, Hakan Bilen

Cross-Task Knowledge Distillation in Multi-Task Recommendation

Multi-task learning (MTL) has been widely used in recommender systems, wherein predicting each type of user feedback on items (e.g., click, purchase) is treated as an individual task and jointly trained with a unified model. Our key…

Information Retrieval · Computer Science 2022-03-29 Chenxiao Yang, Junwei Pan, Xiaofeng Gao, Tingyu Jiang, Dapeng Liu, Guihai Chen

Developing Multi-Task Recommendations with Long-Term Rewards via Policy Distilled Reinforcement Learning

With the explosive growth of online products and content, recommendation techniques have been considered as an effective tool to overcome information overload, improve user experience, and boost business revenue. In recent years, we have…

Machine Learning · Computer Science 2020-01-28 Xi Liu, Li Li, Ping-Chun Hsieh, Muhe Xie, Yong Ge, Rui Chen

Multi-Task Deep Recommender Systems: A Survey

Multi-task learning (MTL) aims at learning related tasks in a unified model to achieve mutual improvement among tasks considering their shared knowledge. It is an important topic in recommendation due to the demand for multi-task prediction…

Information Retrieval · Computer Science 2023-02-10 Yuhao Wang, Ha Tsz Lam, Yi Wong, Ziru Liu, Xiangyu Zhao, Yichao Wang, Bo Chen, Huifeng Guo, Ruiming Tang

Basic Reading Distillation

Large language models (LLMs) have demonstrated remarkable abilities in various natural language processing areas, but they demand substantial computational resources, which limits their real-world deployment. Distillation is one technique to solve…

Computation and Language · Computer Science 2025-07-31 Zhi Zhou, Sirui Miao, Xiangyu Duan, Hao Yang, Min Zhang

MKD: a Multi-Task Knowledge Distillation Approach for Pretrained Language Models

Pretrained language models have led to significant performance gains in many NLP tasks. However, the intensive computing resources to train such models remain an issue. Knowledge distillation alleviates this problem by learning a…

Computation and Language · Computer Science 2020-05-04 Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong

From Reasoning LLMs to BERT: A Two-Stage Distillation Framework for Search Relevance

Query-service relevance prediction in e-commerce search systems faces strict latency requirements that prevent the direct application of Large Language Models (LLMs). To bridge this gap, we propose a two-stage reasoning distillation…

Information Retrieval · Computer Science 2026-01-27 Runze Xia, Yupeng Ji, Yuxi Zhou, Haodong Liu, Teng Zhang, Piji Li

Teach model to answer questions after comprehending the document

Multi-choice Machine Reading Comprehension (MRC) is a challenging extension of Natural Language Processing (NLP) that requires the ability to comprehend the semantics and logical relationships between entities in a given text. The MRC task…

Computation and Language · Computer Science 2023-07-19 Ruiqing Sun, Ping Jian

Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding

This paper explores the use of knowledge distillation to improve a Multi-Task Deep Neural Network (MT-DNN) (Liu et al., 2019) for learning text representations across multiple natural language understanding tasks. Although ensemble learning…

Computation and Language · Computer Science 2019-04-23 Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao

Distill-2MD-MTL: Data Distillation based on Multi-Dataset Multi-Domain Multi-Task Frame Work to Solve Face Related Tasks, Multi Task Learning, Semi-Supervised Learning

We propose a new semi-supervised learning method on face-related tasks based on Multi-Task Learning (MTL) and data distillation. The proposed method exploits multiple datasets with different labels for different-but-related tasks such as…

Computer Vision and Pattern Recognition · Computer Science 2019-07-10 Sepidehsadat Hosseini, Mohammad Amin Shabani, Nam Ik Cho

DDIL: Diversity Enhancing Diffusion Distillation With Imitation Learning

Diffusion models excel at generative modeling (e.g., text-to-image) but sampling requires multiple denoising network passes, limiting practicality. Efforts such as progressive distillation or consistency distillation have shown promise by…

Machine Learning · Computer Science 2025-04-01 Risheek Garrepalli, Shweta Mahajan, Munawar Hayat, Fatih Porikli

Multi-Task Learning for Few-Shot Online Adaptation under Signal Temporal Logic Specifications

Multi-task learning (MTL) seeks to improve the generalized performance of learning specific tasks, exploiting useful information incorporated in related tasks. As a promising area, this paper studies an MTL-based control approach…

Systems and Control · Electrical Eng. & Systems 2024-08-01 Andres Arias, Chuangchuang Sun

Classifying Documents within Multiple Hierarchical Datasets using Multi-Task Learning

Multi-task learning (MTL) is a supervised learning paradigm in which the prediction models for several related tasks are learned jointly to achieve better generalization performance. When there are only a few training examples per task, MTL…

Machine Learning · Computer Science 2017-06-07 Azad Naik, Anveshi Charuvaka, Huzefa Rangwala

Distribution Matching Distillation Meets Reinforcement Learning

Distribution Matching Distillation (DMD) facilitates efficient inference by distilling multi-step diffusion models into few-step variants. Concurrently, Reinforcement Learning (RL) has emerged as a vital tool for aligning generative models…

Computer Vision and Pattern Recognition · Computer Science 2026-03-26 Dengyang Jiang, Dongyang Liu, Zanyi Wang, Qilong Wu, Liuzhuozheng Li, Hengzhuang Li, Xin Jin, David Liu, Changsheng Lu, Zhen Li, Bo Zhang, Mengmeng Wang, Steven Hoi, Peng Gao, Harry Yang

Prototypical Contrastive Learning and Adaptive Interest Selection for Candidate Generation in Recommendations

Deep Candidate Generation plays an important role in large-scale recommender systems. It takes user history behaviors as inputs and learns user and item latent embeddings for candidate generation. In the literature, conventional methods…

Information Retrieval · Computer Science 2022-11-24 Ningning Li, Qunwei Li, Xichen Ding, Shaohu Chen, Wenliang Zhong

Robustly Optimized and Distilled Training for Natural Language Understanding

In this paper, we explore multi-task learning (MTL) as a second pretraining step to learn enhanced universal language representation for transformer language models. We use the MTL enhanced representation across several natural language…

Computation and Language · Computer Science 2021-03-17 Haytham ElFadeel, Stan Peshterliev

Contextual Distillation Model for Diversified Recommendation

The diversity of recommendation is equally crucial as accuracy in improving user experience. Existing studies, e.g., Determinantal Point Process (DPP) and Maximal Marginal Relevance (MMR), employ a greedy paradigm to iteratively select…

Information Retrieval · Computer Science 2024-08-15 Fan Li, Xu Si, Shisong Tang, Dingmin Wang, Kunyan Han, Bing Han, Guorui Zhou, Yang Song, Hechang Chen

Rethinking Position Bias Modeling with Knowledge Distillation for CTR Prediction

Click-through rate (CTR) Prediction is of great importance in real-world online ads systems. One challenge for the CTR prediction task is to capture the real interest of users from their clicked items, which is inherently biased by…

Information Retrieval · Computer Science 2022-04-04 Congcong Liu, Yuejiang Li, Jian Zhu, Xiwei Zhao, Changping Peng, Zhangang Lin, Jingping Shao

Generation-Distillation for Efficient Natural Language Understanding in Low-Data Settings

Over the past year, the emergence of transfer learning with large-scale language models (LM) has led to dramatic performance improvements across a broad range of natural language understanding tasks. However, the size and memory footprint…

Computation and Language · Computer Science 2020-02-04 Luke Melas-Kyriazi, George Han, Celine Liang

Life-long Learning for Multilingual Neural Machine Translation with Knowledge Distillation

A common scenario of Multilingual Neural Machine Translation (MNMT) is that each translation task arrives in a sequential manner, and the training data of previous tasks is unavailable. In this scenario, the current methods suffer heavily…

Computation and Language · Computer Science 2022-12-07 Yang Zhao, Junnan Zhu, Lu Xiang, Jiajun Zhang, Yu Zhou, Feifei Zhai, Chengqing Zong