English
Related papers

Related papers: NDVQ: Robust Neural Audio Codec with Normal Distri…

200 papers

Residual Vector Quantization (RVQ) has become a dominant approach in neural speech and audio coding, providing high-fidelity compression. However, speech coding presents additional challenges due to real-world noise, which degrades…

Sound · Computer Science 2025-06-23 Yunkee Chae , Kyogu Lee

The residual vector quantization (RVQ) technique plays a central role in recent advances in neural audio codecs. These models effectively synthesize high-fidelity audio from a limited number of codes due to the hierarchical structure among…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-24 Hyeongju Kim , Junhyeok Lee , Jacob Morton , Juheon Lee , Jinhyeok Yang

Current neural audio codecs typically use residual vector quantization (RVQ) to discretize speech signals. However, they often experience codebook collapse, which reduces the effective codebook size and leads to suboptimal performance. To…

Audio and Speech Processing · Electrical Eng. & Systems 2025-06-12 Rui-Chen Zheng , Hui-Peng Du , Xiao-Hang Jiang , Yang Ai , Zhen-Hua Ling

Recent state-of-the-art neural audio compression models have progressively adopted residual vector quantization (RVQ). Despite this success, these models employ a fixed number of codebooks per frame, which can be suboptimal in terms of…

The rapid growth of visual data under stringent storage and bandwidth constraints makes extremely low-bitrate image compression increasingly important. While Vector Quantization (VQ) offers strong structural fidelity, existing methods lack…

Computer Vision and Pattern Recognition · Computer Science 2026-05-05 Shiyin Jiang , Wei Long , Minghao Han , Zhenghao Chen , Ce Zhu , Shuhang Gu

Neural audio codecs have recently gained popularity because they can represent audio signals with high fidelity at very low bitrates, making it feasible to use language modeling approaches for audio generation and understanding. Residual…

Sound · Computer Science 2024-10-21 Hubert Siuzdak , Florian Grötschla , Luca A. Lanzendörfer

In theory, vector quantization (VQ) is always better than scalar quantization (SQ) in terms of rate-distortion (R-D) performance. Recent state-of-the-art methods for neural image compression are mainly based on nonlinear transform coding…

Computer Vision and Pattern Recognition · Computer Science 2023-05-26 Runsen Feng , Zongyu Guo , Weiping Li , Zhibo Chen

Vector quantization (VQ) transforms continuous image features into discrete representations, providing compressed, tokenized inputs for generative models. However, VQ-based frameworks suffer from several issues, such as non-smooth latent…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Sicheng Yang , Xing Hu , Qiang Wu , Dawei Yang

Neural audio compression has emerged as a promising technology for efficiently representing speech, music, and general audio. However, existing methods suffer from significant performance degradation at limited bitrates, where the available…

Sound · Computer Science 2026-05-08 Jin Wang , Wenbin Jiang , Xiangbo Wang , Yubo You , Sheng Fang

Vector Quantization (VQ) underpins many modern generative frameworks such as VQ-VAE, VQ-GAN, and latent diffusion models. Yet, it suffers from the persistent problem of codebook collapse, where a large fraction of code vectors remains…

Computer Vision and Pattern Recognition · Computer Science 2026-02-24 Hao Lu , Onur C. Koyun , Yongxin Guo , Zhengjie Zhu , Abbas Alili , Metin Nafi Gurcan

Vector Quantization (VQ) is a well-known technique in deep learning for extracting informative discrete latent representations. VQ-embedded models have shown impressive results in a range of applications including image and speech…

Machine Learning · Computer Science 2023-10-05 Tanmay Gautam , Reid Pryzant , Ziyi Yang , Chenguang Zhu , Somayeh Sojoudi

Neural audio coding has emerged as a vivid research direction by promising good audio quality at very low bitrates unachievable by classical coding techniques. Here, end-to-end trainable autoencoder-like models represent the state of the…

Audio and Speech Processing · Electrical Eng. & Systems 2024-09-20 Andreas Brendel , Nicola Pia , Kishan Gupta , Lyonel Behringer , Guillaume Fuchs , Markus Multrus

Although quantum computing holds promise for solving Combinatorial Optimization Problems (COPs), the limited qubit capacity of NISQ hardware makes large-scale instances intractable. Conventional methods attempt to bridge this gap through…

Quantum Physics · Physics 2026-01-21 Yuhan Huang , Siyuan Jin , Yichi Zhang , Qi Zhao , Jun Qi , Qiming Shao

We present a Split Vector Quantized Variational Autoencoder (SVQ-VAE) architecture using a split vector quantizer for NTTS, as an enhancement to the well-known Variational Autoencoder (VAE) and Vector Quantized Variational Autoencoder…

Sound · Computer Science 2023-09-15 Marek Strong , Jonas Rohnke , Antonio Bonafonte , Mateusz Łajszczak , Trevor Wood

Neural Audio Codecs (NACs) have become increasingly adopted in speech processing tasks due to their excellent rate-distortion performance and compatibility with Large Language Models (LLMs) as discrete feature representations for audio…

Sound · Computer Science 2025-09-15 Harry Julian , Rachel Beeson , Lohith Konathala , Johanna Ulin , Jiameng Gao

Neural audio codec (NAC) is essential for reconstructing high-quality speech signals and generating discrete representations for downstream speech language models. However, ensuring accurate semantic modeling while maintaining high-fidelity…

Audio and Speech Processing · Electrical Eng. & Systems 2026-03-03 Yanzhou Ren , Noboru Harada , Daiki Takeuchi , Siyu Chen , Wei Liu , Xiao Zhang , Liyuan Zhang , Takehiro Moriya , Shoji Makino

Deep learning-based semantic communication has largely relied on analog or semi-digital transmission, which limits compatibility with modern digital communication infrastructures. Recent studies have employed vector quantization (VQ) to…

Signal Processing · Electrical Eng. & Systems 2025-10-22 Zian Meng , Qiang Li , Wenqian Tang , Mingdie Yan , Xiaohu Ge

Vector Quantization (VQ) is a method for discretizing latent representations and has become a major part of the deep learning toolkit. It has been theoretically and empirically shown that discretization of representations leads to improved…

Machine Learning · Computer Science 2022-02-04 Dianbo Liu , Alex Lamb , Xu Ji , Pascal Notsawo , Mike Mozer , Yoshua Bengio , Kenji Kawaguchi

Embedding vectors are widely used for representing unstructured data and searching through it for semantically similar items. However, the large size of these vectors, due to their high-dimensionality, creates problems for modern vector…

Machine Learning · Computer Science 2025-09-24 Mariano Tepper , Ted Willke

In this paper, we explore vector quantization for acoustic unit discovery. Leveraging unlabelled data, we aim to learn discrete representations of speech that separate phonetic content from speaker-specific details. We propose two neural…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-20 Benjamin van Niekerk , Leanne Nortje , Herman Kamper
‹ Prev 1 2 3 10 Next ›