Related papers: Model Compression for DNN-based Speaker Verificati…

Optimization of DNN-based speaker verification model through efficient quantization technique

As Deep Neural Networks (DNNs) rapidly advance in various fields, including speech verification, they typically involve high computational costs and substantial memory consumption, which can be challenging to manage on mobile systems.…

Audio and Speech Processing · Electrical Eng. & Systems 2024-07-15 Yeona Hong , Woo-Jin Chung , Hong-Goo Kang

Model compression via distillation and quantization

Deep neural networks (DNNs) continue to make significant advances, solving tasks from image classification to translation or reinforcement learning. One aspect of the field receiving considerable attention is efficiently executing deep…

Neural and Evolutionary Computing · Computer Science 2018-02-16 Antonio Polino , Razvan Pascanu , Dan Alistarh

Towards Lightweight Speaker Verification via Adaptive Neural Network Quantization

Modern speaker verification (SV) systems typically demand expensive storage and computing resources, thereby hindering their deployment on mobile devices. In this paper, we explore adaptive neural network quantization for lightweight…

Audio and Speech Processing · Electrical Eng. & Systems 2024-12-03 Bei Liu , Haoyu Wang , Yanmin Qian

Retraining-Based Iterative Weight Quantization for Deep Neural Networks

Model compression has gained a lot of attention due to its ability to reduce hardware resource requirements significantly while maintaining accuracy of DNNs. Model compression is especially useful for memory-intensive recurrent neural…

Machine Learning · Computer Science 2018-05-30 Dongsoo Lee , Byeongwook Kim

Efficient Black-Box Speaker Verification Model Adaptation with Reprogramming and Backend Learning

The development of deep neural networks (DNN) has significantly enhanced the performance of speaker verification (SV) systems in recent years. However, a critical issue that persists when applying DNN-based SV systems in practical…

Audio and Speech Processing · Electrical Eng. & Systems 2023-09-26 Jingyu Li , Tan Lee

Weight Normalization based Quantization for Deep Neural Network Compression

With the development of deep neural networks, the size of network models becomes larger and larger. Model compression has become an urgent need for deploying these network models to mobile or embedded devices. Model quantization is a…

Machine Learning · Computer Science 2019-07-02 Wen-Pu Cai , Wu-Jun Li

Optimizing Deep Neural Networks using Safety-Guided Self Compression

The deployment of deep neural networks on resource-constrained devices necessitates effective model compression strategies that judiciously balance the reduction of model size with the preservation of performance. This study introduces a…

Machine Learning · Computer Science 2025-05-02 Mohammad Zbeeb , Mariam Salman , Mohammad Bazzi , Ammar Mohanna

VecQ: Minimal Loss DNN Model Compression With Vectorized Weight Quantization

Quantization has been proven to be an effective method for reducing the computing and/or storage cost of DNNs. However, the trade-off between the quantization bitwidth and final accuracy is complex and non-convex, which makes it difficult…

Computer Vision and Pattern Recognition · Computer Science 2020-06-11 Cheng Gong , Yao Chen , Ye Lu , Tao Li , Cong Hao , Deming Chen

Quantization of Deep Neural Networks for Accurate Edge Computing

Deep neural networks (DNNs) have demonstrated their great potential in recent years, exceeding the performance of human experts in a wide range of applications. Due to their large sizes, however, compression techniques such as weight…

Computer Vision and Pattern Recognition · Computer Science 2021-10-15 Wentao Chen , Hailong Qiu , Jian Zhuang , Chutong Zhang , Yu Hu , Qing Lu , Tianchen Wang , Yiyu Shi , Meiping Huang , Xiaowei Xu

Empirical Evaluation of Deep Learning Model Compression Techniques on the WaveNet Vocoder

WaveNet is a state-of-the-art text-to-speech vocoder that remains challenging to deploy due to its autoregressive loop. In this work we focus on ways to accelerate the original WaveNet architecture directly, as opposed to modifying the…

Machine Learning · Computer Science 2020-11-23 Sam Davis , Giuseppe Coccia , Sam Gooch , Julian Mack

Structured Multi-Hashing for Model Compression

Despite the success of deep neural networks (DNNs), state-of-the-art models are too large to deploy on low-resource devices or common server configurations in which multiple models are held in memory. Model compression methods address this…

Machine Learning · Computer Science 2019-11-27 Elad Eban , Yair Movshovitz-Attias , Hao Wu , Mark Sandler , Andrew Poon , Yerlan Idelbayev , Miguel A. Carreira-Perpinan

Neural Networks Weights Quantization: Target None-retraining Ternary (TNT)

Quantization of weights of deep neural networks (DNN) has proven to be an effective solution for the purpose of implementing DNNs on edge devices such as mobiles, ASICs and FPGAs, because they have no sufficient resources to support…

Machine Learning · Computer Science 2019-12-20 Tianyu Zhang , Lei Zhu , Qian Zhao , Kilho Shin

Low-bit Shift Network for End-to-End Spoken Language Understanding

Deep neural networks (DNN) have achieved impressive success in multiple domains. Over the years, the accuracy of these models has increased with the proliferation of deeper and more complex architectures. Thus, state-of-the-art solutions…

Sound · Computer Science 2022-07-18 Anderson R. Avila , Khalil Bibi , Rui Heng Yang , Xinlin Li , Chao Xing , Xiao Chen

Efficient VQ-QAT and Mixed Vector/Linear quantized Neural Networks

In this work, we developed and tested 3 techniques for vector quantization (VQ) based model weight compression. To mitigate codebook collapse and enable end-to-end training, we adopted cosine similarity-based assignment. Building on ideas…

Machine Learning · Computer Science 2026-04-28 Terry Gou , Puneet Gupta

Compression strategies and space-conscious representations for deep neural networks

Recent advances in deep learning have made available large, powerful convolutional neural networks (CNN) with state-of-the-art performance in several real-world applications. Unfortunately, these large-sized models have millions of…

Machine Learning · Computer Science 2020-07-17 Giosuè Cataldo Marinò , Gregorio Ghidoli , Marco Frasca , Dario Malchiodi

Universal Deep Neural Network Compression

In this paper, we investigate lossy compression of deep neural networks (DNNs) by weight quantization and lossless source coding for memory-efficient deployment. Whereas the previous work addressed non-universal scalar quantization and…

Computer Vision and Pattern Recognition · Computer Science 2019-02-22 Yoojin Choi , Mostafa El-Khamy , Jungwon Lee

Model compression as constrained optimization, with application to neural nets. Part II: quantization

We consider the problem of deep neural net compression by quantization: given a large, reference net, we want to quantize its real-valued weights using a codebook with $K$ entries so that the training loss of the quantized net is minimal.…

Machine Learning · Computer Science 2017-07-17 Miguel Á. Carreira-Perpiñán , Yerlan Idelbayev

Towards Lightweight Applications: Asymmetric Enroll-Verify Structure for Speaker Verification

With the development of deep learning, automatic speaker verification has made considerable progress over the past few years. However, to design a lightweight and robust system with limited computational resources is still a challenging…

Sound · Computer Science 2022-01-27 Qingjian Lin , Lin Yang , Xuyang Wang , Xiaoyi Qin , Junjie Wang , Ming Li

Binary Neural Network for Speaker Verification

Although deep neural networks are successful for many tasks in the speech domain, the high computational and memory costs of deep neural networks make it difficult to directly deploy high-performance Neural Network systems on low-resource…

Sound · Computer Science 2021-04-07 Tinglong Zhu , Xiaoyi Qin , Ming Li

Extreme Model Compression with Structured Sparsity at Low Precision

Deep neural networks (DNNs) are used in many applications, but their large size and high computational cost make them hard to run on devices with limited resources. Two widely used techniques to address this challenge are weight…

Computer Vision and Pattern Recognition · Computer Science 2025-11-12 Dan Liu , Nikita Dvornik , Xue Liu