Related papers: Logic Distillation: Learning from Code Function by…

200 papers

Large Language Models (LLMs) demonstrate exceptional reasoning capabilities, often achieving state-of-the-art performance in various tasks. However, their substantial computational and memory demands, due to billions of parameters, hinder…

Computation and Language · Computer Science 2024-11-25 Xunyu Zhu, Jian Li, Can Ma, Weiping Wang

Large Language Models (LLMs) have displayed remarkable performances across various complex tasks by leveraging Chain-of-Thought (CoT) prompting. Recently, studies have proposed a Knowledge Distillation (KD) approach, reasoning distillation,…

Computation and Language · Computer Science 2024-10-14 Hojae Lee, Junho Kim, SangKeun Lee

Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge. However, deployment of the LLMs in real-world applications can be challenging due to…

Computation and Language · Computer Science 2023-10-31 Minki Kang, Seanie Lee, Jinheon Baek, Kenji Kawaguchi, Sung Ju Hwang

Large language models (LLMs) have showcased remarkable capabilities in complex reasoning through chain of thought (CoT) prompting. Recently, there has been a growing interest in transferring these reasoning abilities from LLMs to smaller…

Computation and Language · Computer Science 2023-12-21 Hongzhan Chen, Siyue Wu, Xiaojun Quan, Rui Wang, Ming Yan, Ji Zhang

Despite the advanced intelligence abilities of large language models (LLMs) in various applications, they still face significant computational and storage demands. Knowledge Distillation (KD) has emerged as an effective strategy to improve…

Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of natural language processing tasks. However, their enormous parameter size and extremely high requirements for compute power pose challenges for…

Computation and Language · Computer Science 2024-03-26 Bohao Yang, Chen Tang, Kun Zhao, Chenghao Xiao, Chenghua Lin

Large Language Models (LLMs) have showcased exceptional capabilities in various domains, attracting significant interest from both academia and industry. Despite their impressive performance, the substantial size and computational demands…

Computation and Language · Computer Science 2024-07-03 Chuanpeng Yang, Wang Lu, Yao Zhu, Yidong Wang, Qian Chen, Chenlong Gao, Bingjie Yan, Yiqiang Chen

Knowledge Distillation (KD) is a promising technique for reducing the high computational demand of large language models (LLMs). However, previous KD methods are primarily applied to white-box classification models or training small models…

Computation and Language · Computer Science 2026-02-03 Yuxian Gu, Li Dong, Furu Wei, Minlie Huang

LLMs are increasingly explored for bundle generation, thanks to their reasoning capabilities and knowledge. However, deploying large-scale LLMs introduces significant efficiency challenges, primarily high computational costs during…

Computation and Language · Computer Science 2025-04-25 Kaidong Feng, Zhu Sun, Jie Yang, Hui Fang, Xinghua Qu, Wenyuan Liu

The exponential growth of Large Language Models (LLMs) continues to highlight the need for efficient strategies to meet ever-expanding computational and data demands. This survey provides a comprehensive analysis of two complementary…

In the era of Large Language Models (LLMs), Knowledge Distillation (KD) emerges as a pivotal methodology for transferring advanced capabilities from leading proprietary LLMs, such as GPT-4, to their open-source counterparts like LLaMA and…

Computation and Language · Computer Science 2024-10-22 Xiaohan Xu, Ming Li, Chongyang Tao, Tao Shen, Reynold Cheng, Jinyang Li, Can Xu, Dacheng Tao, Tianyi Zhou

Large language models (LLMs) provide a promising way for accurate session-based recommendation (SBR), but they demand substantial computational time and memory. Knowledge distillation (KD)-based methods can alleviate these issues by…

Information Retrieval · Computer Science 2025-02-25 Yingpeng Du, Zhu Sun, Ziyan Wang, Haoyan Chua, Jie Zhang, Yew-Soon Ong

Distilling the thinking traces of a Large Language Model (LLM) with reasoning capabilities into a smaller model has been proven effective. Yet, there is a scarcity of work done on how model performances scale with the quantity of…

Computation and Language · Computer Science 2025-10-08 Muyu He, Muhammad Ali Shafique, Anand Kumar, Tsach Mackey, Nazneen Rajani

Large language models (LLMs) have achieved remarkable advancements in natural language processing. However, the massive scale and computational demands of these models present formidable challenges when considering their practical…

Computation and Language · Computer Science 2024-04-09 Weize Liu, Guocong Li, Kai Zhang, Bang Du, Qiyuan Chen, Xuming Hu, Hongxia Xu, Jintai Chen, Jian Wu

Knowledge distillation (KD) is a powerful paradigm for compressing large language models (LLMs), whose effectiveness depends on intertwined choices of divergence direction, optimization strategy, and data regime. We break down the design of…

Computation and Language · Computer Science 2026-04-23 Wenhong Zhu, Ruobing Xie, Rui Wang, Pengfei Liu

Large Language Models (LLMs) can transfer their reasoning skills to smaller models by teaching them to generate the intermediate reasoning process required to solve multistep reasoning tasks. While LLMs can accurately solve reasoning tasks…

Artificial Intelligence · Computer Science 2024-10-25 Shivam Adarsh, Kumar Shridhar, Caglar Gulcehre, Nicholas Monath, Mrinmaya Sachan

Deploying accurate Text-to-SQL systems at the enterprise level faces a difficult trilemma involving cost, security and performance. Current solutions force enterprises to choose between expensive, proprietary Large Language Models (LLMs)…

Computation and Language · Computer Science 2026-03-13 Khushboo Thaker, Yony Bresler

Large language models (LLMs) excel at complex reasoning tasks but remain computationally expensive, limiting their practical deployment. To address this, recent works have focused on distilling reasoning capabilities into smaller language…

Computation and Language · Computer Science 2025-11-06 Minki Kang, Jongwon Jeong, Seanie Lee, Jaewoong Cho, Sung Ju Hwang

Pre-trained language models (PLMs) have emerged as powerful tools for code understanding. However, deploying these PLMs in large-scale applications faces practical challenges due to their computational intensity and inference latency.…

Software Engineering · Computer Science 2025-08-22 Ruiqi Wang, Zezhou Yang, Cuiyun Gao, Xin Xia, Qing Liao

Large Language Models (LLMs) have recently made significant advances in code generation through the "Chain-of-Thought" prompting technique. This technique empowers the model to autonomously devise "solution plans" to tackle intricate…

Software Engineering · Computer Science 2024-03-21 Zhihong Sun, Chen Lyu, Bolun Li, Yao Wan, Hongyu Zhang, Ge Li, Zhi Jin