Related papers: Parameter Efficient Diverse Paraphrase Generation …

Generation-Distillation for Efficient Natural Language Understanding in Low-Data Settings

Over the past year, the emergence of transfer learning with large-scale language models (LM) has led to dramatic performance improvements across a broad range of natural language understanding tasks. However, the size and memory footprint…

Computation and Language · Computer Science 2020-02-04 Luke Melas-Kyriazi, George Han, Celine Liang

Impossible Distillation: from Low-Quality Model to High-Quality Dataset & Model for Summarization and Paraphrasing

We present Impossible Distillation, a novel framework for paraphrasing and sentence summarization, that distills a high-quality dataset and model from a low-quality teacher that itself cannot perform these tasks. Unlike prior works that…

Computation and Language · Computer Science 2024-08-21 Jaehun Jung, Peter West, Liwei Jiang, Faeze Brahman, Ximing Lu, Jillian Fisher, Taylor Sorensen, Yejin Choi

Improving Mathematical Reasoning Capabilities of Small Language Models via Feedback-Driven Distillation

Large Language Models (LLMs) demonstrate exceptional reasoning capabilities, often achieving state-of-the-art performance in various tasks. However, their substantial computational and memory demands, due to billions of parameters, hinder…

Computation and Language · Computer Science 2024-11-25 Xunyu Zhu, Jian Li, Can Ma, Weiping Wang

ParaFusion: A Large-Scale LLM-Driven English Paraphrase Dataset Infused with High-Quality Lexical and Syntactic Diversity

Paraphrase generation is a pivotal task in natural language processing (NLP). Existing datasets in the domain lack syntactic and lexical diversity, resulting in paraphrases that closely resemble the source sentences. Moreover, these…

Computation and Language · Computer Science 2024-04-19 Lasal Jayawardena, Prasan Yapa

ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset by AMR Back-Translation

Paraphrase generation is a long-standing task in natural language processing (NLP). Supervised paraphrase generation models, which rely on human-annotated paraphrase pairs, are cost-inefficient and hard to scale up. On the other hand,…

Computation and Language · Computer Science 2023-05-29 Kuan-Hao Huang, Varun Iyer, I-Hung Hsu, Anoop Kumar, Kai-Wei Chang, Aram Galstyan

Paraphrase and Aggregate with Large Language Models for Minimizing Intent Classification Errors

Large language models (LLM) have achieved remarkable success in natural language generation but lesser focus has been given to their applicability in decision making tasks such as classification. We show that LLMs like LLaMa can achieve…

Computation and Language · Computer Science 2024-06-26 Vikas Yadav, Zheng Tang, Vijay Srinivasan

Multi-stage Distillation Framework for Cross-Lingual Semantic Similarity Matching

Previous studies have proved that cross-lingual knowledge distillation can significantly improve the performance of pre-trained models for cross-lingual similarity matching tasks. However, the student model needs to be large in this…

Computation and Language · Computer Science 2022-09-14 Kunbo Ding, Weijie Liu, Yuejian Fang, Zhe Zhao, Qi Ju, Xuefeng Yang

LLM-NEO: Parameter Efficient Knowledge Distillation for Large Language Models

Knowledge distillation (KD) has been a predominant method for compressing Large Language Models (LLMs). In this paper, we first revisit KD and Low-Rank Adaption (LoRA) and demonstrate that they follow the same paradigm. Inspired by this…

Computation and Language · Computer Science 2025-02-26 Runming Yang, Taiqiang Wu, Jiahao Wang, Pengfei Hu, Yik-Chung Wu, Ngai Wong, Yujiu Yang

Effective Distillation of Table-based Reasoning Ability from LLMs

Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of natural language processing tasks. However, their enormous parameter size and extremely high requirements for compute power pose challenges for…

Computation and Language · Computer Science 2024-03-26 Bohao Yang, Chen Tang, Kun Zhao, Chenghao Xiao, Chenghua Lin

Distillation Matters: Empowering Sequential Recommenders to Match the Performance of Large Language Model

Owing to their powerful semantic reasoning capabilities, Large Language Models (LLMs) have been effectively utilized as recommenders, achieving impressive performance. However, the high inference latency of LLMs significantly restricts…

Information Retrieval · Computer Science 2024-08-21 Yu Cui, Feng Liu, Pengbo Wang, Bohao Wang, Heng Tang, Yi Wan, Jun Wang, Jiawei Chen

Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation

Sequential recommender systems have achieved significant success in modeling temporal user behavior but remain limited in capturing rich user semantics beyond interaction patterns. Large Language Models (LLMs) present opportunities to…

Information Retrieval · Computer Science 2026-04-24 Nikita Severin, Danil Kartushov, Vladislav Urzhumov, Vladislav Kulikov, Oksana Konovalova, Alexey Grishanov, Anton Klenitskiy, Artem Fatkulin, Alexey Vasilev, Andrey Savchenko, Ilya Makarov

RM-Distiller: Exploiting Generative LLM for Reward Model Distillation

Reward models (RMs) play a pivotal role in aligning large language models (LLMs) with human preferences. Due to the difficulty of obtaining high-quality human preference annotations, distilling preferences from generative LLMs has emerged…

Computation and Language · Computer Science 2026-01-21 Hongli Zhou, Hui Huang, Wei Liu, Chenglong Wang, Xingyuan Bu, Lvyuan Han, Fuhai Song, Muyun Yang, Wenhao Jiang, Hailong Cao, Tiejun Zhao

Multi-Granularity Semantic Revision for Large Language Model Distillation

Knowledge distillation plays a key role in compressing the Large Language Models (LLMs), which boosts a small-size student model under large teacher models' guidance. However, existing LLM distillation methods overly rely on…

Computation and Language · Computer Science 2024-07-16 Xiaoyu Liu, Yun Zhang, Wei Li, Simiao Li, Xudong Huang, Hanting Chen, Yehui Tang, Jie Hu, Zhiwei Xiong, Yunhe Wang

An Active Learning Framework for Inclusive Generation by Large Language Models

Ensuring that Large Language Models (LLMs) generate text representative of diverse sub-populations is essential, particularly when key concepts related to under-represented groups are scarce in the training data. We address this challenge…

Computation and Language · Computer Science 2024-12-17 Sabit Hassan, Anthony Sicilia, Malihe Alikhani

Small But Funny: A Feedback-Driven Approach to Humor Distillation

The emergence of Large Language Models (LLMs) has brought to light promising language generation capabilities, particularly in performing tasks like complex reasoning and creative writing. Consequently, distillation through imitation of…

Computation and Language · Computer Science 2024-02-29 Sahithya Ravi, Patrick Huber, Akshat Shrivastava, Aditya Sagar, Ahmed Aly, Vered Shwartz, Arash Einolghozati

A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training

Modern Natural Language Generation (NLG) models come with massive computational and storage requirements. In this work, we study the potential of compressing them, which is crucial for real-world applications serving millions of users. We…

Computation and Language · Computer Science 2023-05-29 Nitay Calderon, Subhabrata Mukherjee, Roi Reichart, Amir Kantor

Decomposable Neural Paraphrase Generation

Paraphrasing exists at different granularity levels, such as lexical level, phrasal level and sentential level. This paper presents Decomposable Neural Paraphrase Generator (DNPG), a Transformer-based model that can learn and generate…

Computation and Language · Computer Science 2019-06-25 Zichao Li, Xin Jiang, Lifeng Shang, Qun Liu

Improved Paraphrase Generation via Controllable Latent Diffusion

Paraphrase generation strives to generate high-quality and diverse expressions of a given text, a domain where diffusion models excel. Though SOTA diffusion generation reconciles generation quality and diversity, textual diffusion suffers…

Computation and Language · Computer Science 2025-01-20 Wei Zou, Ziyuan Zhuang, Xiang Geng, Shujian Huang, Jia Liu, Jiajun Chen

Knowledge Distillation and Dataset Distillation of Large Language Models: Emerging Trends, Challenges, and Future Directions

The exponential growth of Large Language Models (LLMs) continues to highlight the need for efficient strategies to meet ever-expanding computational and data demands. This survey provides a comprehensive analysis of two complementary…

Computation and Language · Computer Science 2026-01-06 Luyang Fang, Xiaowei Yu, Jiazhang Cai, Yongkai Chen, Shushan Wu, Zhengliang Liu, Zhenyuan Yang, Haoran Lu, Xilin Gong, Yufang Liu, Terry Ma, Wei Ruan, Ali Abbasi, Jing Zhang, Tao Wang, Ehsan Latif, Weihang You, Hanqi Jiang, Wei Liu, Wei Zhang, Soheil Kolouri, Xiaoming Zhai, Dajiang Zhu, Wenxuan Zhong, Tianming Liu, Ping Ma

Sequence-Level Knowledge Distillation

Neural machine translation (NMT) offers a novel alternative formulation of translation that is potentially simpler than statistical approaches. However to reach competitive performance, NMT models need to be exceedingly large. In this paper…

Computation and Language · Computer Science 2016-09-23 Yoon Kim, Alexander M. Rush