Related papers: Variable Bitrate Residual Vector Quantization for …

Towards Bitrate-Efficient and Noise-Robust Speech Coding with Variable Bitrate RVQ

Residual Vector Quantization (RVQ) has become a dominant approach in neural speech and audio coding, providing high-fidelity compression. However, speech coding presents additional challenges due to real-world noise, which degrades…

Sound · Computer Science 2025-06-23 Yunkee Chae, Kyogu Lee

Switchcodec: Adaptive residual-expert sparse quantization for high-fidelity neural audio coding

Recent neural audio compression models often rely on residual vector quantization for high-fidelity coding, but using a fixed number of per-frame codebooks is suboptimal for the wide variability of audio content, especially for signals that…

Sound · Computer Science 2026-05-08 Xiangbo Wang, Wenbin Jiang, Jin Wang, Yubo You, Sheng Fang, Fei Wen

Improving Test-Time Performance of RVQ-based Neural Codecs

The residual vector quantization (RVQ) technique plays a central role in recent advances in neural audio codecs. These models effectively synthesize high-fidelity audio from a limited number of codes due to the hierarchical structure among…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-24 Hyeongju Kim, Junhyeok Lee, Jacob Morton, Juheon Lee, Jinhyeok Yang

ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs

Current neural audio codecs typically use residual vector quantization (RVQ) to discretize speech signals. However, they often experience codebook collapse, which reduces the effective codebook size and leads to suboptimal performance. To…

Audio and Speech Processing · Electrical Eng. & Systems 2025-06-12 Rui-Chen Zheng, Hui-Peng Du, Xiao-Hang Jiang, Yang Ai, Zhen-Hua Ling

Cross-Scale Vector Quantization for Scalable Neural Speech Coding

Bitrate scalability is a desirable feature for audio coding in real-time communications. Existing neural audio codecs usually enforce a specific bitrate during training, so different models need to be trained for each target bitrate, which…

Sound · Computer Science 2022-07-08 Xue Jiang, Xiulian Peng, Huaying Xue, Yuan Zhang, Yan Lu

SNAC: Multi-Scale Neural Audio Codec

Neural audio codecs have recently gained popularity because they can represent audio signals with high fidelity at very low bitrates, making it feasible to use language modeling approaches for audio generation and understanding. Residual…

Sound · Computer Science 2024-10-21 Hubert Siuzdak, Florian Grötschla, Luca A. Lanzendörfer

SwitchCodec: A High-Fidelity Neural Audio Codec With Sparse Quantization

Neural audio compression has emerged as a promising technology for efficiently representing speech, music, and general audio. However, existing methods suffer from significant performance degradation at limited bitrates, where the available…

Sound · Computer Science 2026-05-08 Jin Wang, Wenbin Jiang, Xiangbo Wang, Yubo You, Sheng Fang

NDVQ: Robust Neural Audio Codec with Normal Distribution-Based Vector Quantization

Built upon vector quantization (VQ), discrete audio codec models have achieved great success in audio compression and auto-regressive audio generation. However, existing models face substantial challenges in perceptual quality and signal…

Audio and Speech Processing · Electrical Eng. & Systems 2024-09-20 Zhikang Niu, Sanyuan Chen, Long Zhou, Ziyang Ma, Xie Chen, Shujie Liu

Differentiable Vector Quantization for Rate-Distortion Optimization of Generative Image Compression

The rapid growth of visual data under stringent storage and bandwidth constraints makes extremely low-bitrate image compression increasingly important. While Vector Quantization (VQ) offers strong structural fidelity, existing methods lack…

Computer Vision and Pattern Recognition · Computer Science 2026-05-05 Shiyin Jiang, Wei Long, Minghao Han, Zhenghao Chen, Ce Zhu, Shuhang Gu

Rate-Adaptive Quantization: A Multi-Rate Codebook Adaptation for Vector Quantization-based Generative Models

Learning discrete representations with vector quantization (VQ) has emerged as a powerful approach in various generative models. However, most VQ-based models rely on a single, fixed-rate codebook, requiring extensive retraining for new…

Machine Learning · Computer Science 2025-02-03 Jiwan Seo, Joonhyuk Kang

Neural Speech Coding for Real-time Communications using Constant Bitrate Scalar Quantization

Neural audio coding has emerged as a vivid research direction by promising good audio quality at very low bitrates unachievable by classical coding techniques. Here, end-to-end trainable autoencoder-like models represent the state of the…

Audio and Speech Processing · Electrical Eng. & Systems 2024-09-20 Andreas Brendel, Nicola Pia, Kishan Gupta, Lyonel Behringer, Guillaume Fuchs, Markus Multrus

VAEVQ: Enhancing Discrete Visual Tokenization through Variational Modeling

Vector quantization (VQ) transforms continuous image features into discrete representations, providing compressed, tokenized inputs for generative models. However, VQ-based frameworks suffer from several issues, such as non-smooth latent…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Sicheng Yang, Xing Hu, Qiang Wu, Dawei Yang

Improved Residual Vector Quantization for High-dimensional Approximate Nearest Neighbor Search

Quantization methods have been introduced to perform large scale approximate nearest search tasks. Residual Vector Quantization (RVQ) is one of the effective quantization methods. RVQ uses a multi-stage codebook learning scheme to lower the…

Computer Vision and Pattern Recognition · Computer Science 2015-09-18 Shicong Liu, Hongtao Lu, Junru Shao

VRVVC: Variable-Rate NeRF-Based Volumetric Video Compression

Neural Radiance Field (NeRF)-based volumetric video has revolutionized visual media by delivering photorealistic Free-Viewpoint Video (FVV) experiences that provide audiences with unprecedented immersion and interactivity. However, the…

Image and Video Processing · Electrical Eng. & Systems 2024-12-17 Qiang Hu, Houqiang Zhong, Zihan Zheng, Xiaoyun Zhang, Zhengxue Cheng, Li Song, Guangtao Zhai, Yanfeng Wang

On Quantizing Neural Representation for Variable-Rate Video Coding

This work introduces NeuroQuant, a novel post-training quantization (PTQ) approach tailored to non-generalized Implicit Neural Representations for variable-rate Video Coding (INR-VC). Unlike existing methods that require extensive weight…

Image and Video Processing · Electrical Eng. & Systems 2025-02-18 Junqi Shi, Zhujia Chen, Hanfei Li, Qi Zhao, Ming Lu, Tong Chen, Zhan Ma

DiVeQ: Differentiable Vector Quantization Using the Reparameterization Trick

Vector quantization is common in deep models, yet its hard assignments block gradients and hinder end-to-end training. We propose DiVeQ, which treats quantization as adding an error vector that mimics the quantization distortion, keeping…

Machine Learning · Computer Science 2026-05-27 Mohammad Hassan Vali, Tom Bäckström, Arno Solin

Residual Quantization with Implicit Neural Codebooks

Vector quantization is a fundamental operation for data compression and vector search. To obtain high accuracy, multi-codebook methods represent each vector using codewords across several codebooks. Residual quantization (RQ) is one such…

Machine Learning · Computer Science 2024-05-22 Iris A. M. Huijben, Matthijs Douze, Matthew Muckley, Ruud J. G. van Sloun, Jakob Verbeek

Residual vector quantization for KV cache compression in large language model

KV cache compression methods have mainly relied on scalar quantization techniques to reduce the memory requirements during decoding. In this work, we apply residual vector quantization, which has been widely used for high fidelity audio…

Machine Learning · Computer Science 2024-10-22 Ankur Kumar

Scalable Training for Vector-Quantized Networks with 100% Codebook Utilization

Vector quantization (VQ) is a key component in discrete tokenizers for image generation, but its training is often unstable due to straight-through estimation bias, one-step-behind updates, and sparse codebook gradients, which lead to…

Computer Vision and Pattern Recognition · Computer Science 2025-09-15 Yifan Chang, Jie Qin, Limeng Qiao, Xiaofeng Wang, Zheng Zhu, Lin Ma, Xingang Wang

Quantizer-Aware Hierarchical Neural Codec Modeling for Speech Deepfake Detection

Neural audio codecs discretize speech via residual vector quantization (RVQ), forming a coarse-to-fine hierarchy across quantizers. While codec models have been explored for representation learning, their discrete structure remains…

Sound · Computer Science 2026-03-19 Jinyang Wu, Zihan Pan, Qiquan Zhang, Sailor Hardik Bhupendra, Soumik Mondal