Related papers: Efficient Audio Captioning with Encoder-Level Know…

200 papers

Although large foundation models pre-trained by self-supervised learning have achieved state-of-the-art performance in many tasks including automatic speech recognition (ASR), knowledge distillation (KD) is often required in practice to…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-21 Xiaoyu Yang, Qiujia Li, Chao Zhang, Philip C. Woodland

Recently, the advance in deep learning has brought a considerable improvement in the end-to-end speech recognition field, simplifying the traditional pipeline while producing promising results. Among the end-to-end models, the connectionist…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-29 Ji Won Yoon, Beom Jun Woo, Sunghwan Ahn, Hyeonseung Lee, Nam Soo Kim

Knowledge Distillation (KD) is a widespread technique for compressing the knowledge of large models into more compact and efficient models. KD has proved to be highly effective in building well-performing low-complexity Acoustic Scene…

Sound · Computer Science 2025-03-17 Tobias Morocutti, Florian Schmid, Khaled Koutini, Gerhard Widmer

Streaming automatic speech recognition (ASR) models are restricted from accessing future context, which results in worse performance compared to the non-streaming models. To improve the performance of streaming ASR, knowledge distillation…

Computation and Language · Computer Science 2023-09-01 Kyuhong Shim, Jinkyu Lee, Simyung Chang, Kyuwoong Hwang

Deep learning has shown promise in enhancing channel state information (CSI) feedback. However, many studies indicate that better feedback performance often accompanies higher computational complexity. Pursuing better performance-complexity…

Signal Processing · Electrical Eng. & Systems 2024-03-05 Yiming Cui, Jiajia Guo, Zheng Cao, Huaze Tang, Chao-Kai Wen, Shi Jin, Xin Wang, Xiaolin Hou

Device-directed speech detection (DDSD) is a binary classification task that separates the user's queries to a voice assistant (VA) from background speech or side conversations. This is important for achieving naturalistic user experience.…

Augmentation and knowledge distillation (KD) are well-established techniques employed in audio classification tasks, aimed at enhancing performance and reducing model sizes on the widely recognized Audioset (AS) benchmark. Although both…

Sound · Computer Science 2023-09-11 Heinrich Dinkel, Yongqing Wang, Zhiyong Yan, Junbo Zhang, Yujun Wang

Automatically describing audio-visual content with texts, namely video captioning, has received significant attention due to its potential applications across diverse fields. Deep neural networks are the dominant methods, offering…

Audio and Speech Processing · Electrical Eng. & Systems 2023-06-19 Özkan Çaylı, Xubo Liu, Volkan Kılıç, Wenwu Wang

Speech denoising is a generally adopted and impactful task, appearing in many common and everyday-life use cases. Although there are very powerful methods published, most of those are too complex for deployment in everyday and low-resources…

Sound · Computer Science 2025-05-07 Diep Luong, Mikko Heikkinen, Konstantinos Drossos, Tuomas Virtanen

Knowledge Distillation (KD) compresses computationally expensive pre-trained language models (PLMs) by transferring their knowledge to smaller models, allowing their use in resource-constrained or real-time settings. However, most smaller…

Computation and Language · Computer Science 2023-11-08 Hayeon Lee, Rui Hou, Jongpil Kim, Davis Liang, Hongbo Zhang, Sung Ju Hwang, Alexander Min

Transformer encoder with connectionist temporal classification (CTC) framework is widely used for automatic speech recognition (ASR). However, knowledge distillation (KD) for ASR displays a problem of disagreement between teacher-student…

Audio and Speech Processing · Electrical Eng. & Systems 2024-06-13 Eungbeom Kim, Hantae Kim, Kyogu Lee

Knowledge distillation (KD) is a common approach to compress a teacher model to reduce its inference cost and memory footprint, by training a smaller student model. However, in the context of autoregressive language models (LMs), we…

Computation and Language · Computer Science 2024-06-18 Qihuang Zhong, Liang Ding, Li Shen, Juhua Liu, Bo Du, Dacheng Tao

Knowledge distillation (KD) is a technique for transferring knowledge from complex teacher models to simpler student models, significantly enhancing model efficiency and accuracy. It has demonstrated substantial advancements in various…

Computation and Language · Computer Science 2025-04-21 Junjie Yang, Junhao Song, Xudong Han, Ziqian Bi, Tianyang Wang, Chia Xin Liang, Xinyuan Song, Yichao Zhang, Qian Niu, Benji Peng, Keyu Chen, Ming Liu

Knowledge distillation (KD) is the de facto standard for compressing large-scale models into smaller ones. Prior works have explored ever more complex KD strategies involving different objective functions, teacher-ensembles, and weight…

Computer Vision and Pattern Recognition · Computer Science 2025-05-06 Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem, Talfan Evans, Samuel Albanie, Federico Tombari, Yongqin Xian, Alessio Tonioni, Olivier J. Hénaff

The smaller memory bandwidth in smart devices prompts development of smaller Automatic Speech Recognition (ASR) models. To obtain a smaller model, one can employ the model compression techniques. Knowledge distillation (KD) is a popular…

Sound · Computer Science 2022-10-04 Jash Rathod, Nauman Dawalatabad, Shatrughan Singh, Dhananjaya Gowda

Knowledge distillation (KD) is a widely used technique to transfer knowledge from a large teacher network to a smaller student model. Traditional KD uses a fixed balancing factor alpha as a hyperparameter to combine the hard-label…

Computer Vision and Pattern Recognition · Computer Science 2025-09-09 Zhengda Li

Knowledge distillation (KD) is commonly deemed as an effective model compression technique in which a compact model (student) is trained under the supervision of a larger pretrained model or an ensemble of models (teacher). Various…

Machine Learning · Computer Science 2020-07-08 Fahad Sarfraz, Elahe Arani, Bahram Zonooz

Dense visual prediction tasks, such as detection and segmentation, are crucial for time-critical applications (e.g., autonomous driving and video surveillance). While deep models achieve strong performance, their efficiency remains a…

Computer Vision and Pattern Recognition · Computer Science 2025-03-11 Qizhen Lan, Qing Tian

Knowledge distillation (KD) improves the performance of a low-complexity student model with the help of a more powerful teacher. The teacher in KD is a black-box model, imparting knowledge to the student only through its predictions. This…

Machine Learning · Computer Science 2023-10-05 Sayantan Chowdhury, Ben Liang, Ali Tizghadam, Ilijc Albanese

Automated audio captioning (AAC) is an audio-to-text task to describe audio contents in natural language. Recently, the advancements in large language models (LLMs), with improvements in training approaches for audio encoders, have opened…

Sound · Computer Science 2024-06-26 Jizhong Liu, Gang Li, Junbo Zhang, Heinrich Dinkel, Yongqing Wang, Zhiyong Yan, Yujun Wang, Bin Wang