Related papers: Perceptual Vector Quantization For Video Coding

The Equalizer: Introducing Shape-Gain Decomposition in Neural Audio Codecs

Neural audio codecs (NACs) typically encode the short-term energy (gain) and normalized structure (shape) of speech/audio signals jointly within the same latent space. As a result, they are poorly robust to a global variation of the input…

Sound · Computer Science 2026-02-18 Samir Sadok, Laurent Girin, Xavier Alameda-Pineda

Exploiting Latent Properties to Optimize Neural Codecs

End-to-end image and video codecs are becoming increasingly competitive, compared to traditional compression techniques that have been developed through decades of manual engineering efforts. These trainable codecs have many advantages over…

Computer Vision and Pattern Recognition · Computer Science 2025-01-03 Muhammet Balcilar, Bharath Bhushan Damodaran, Karam Naser, Franck Galpin, Pierre Hellier

Conditional Coding and Variable Bitrate for Practical Learned Video Coding

This paper introduces a practical learned video codec. Conditional coding and quantization gain vectors are used to provide flexibility to a single encoder/decoder pair, which is able to compress video sequences at a variable bitrate. The…

Neural and Evolutionary Computing · Computer Science 2021-04-21 Théo Ladune, Pierrick Philippe, Wassim Hamidouche, Lu Zhang, Olivier Déforges

Prediction-Aware Quality Enhancement of VVC Using CNN

The upcoming video coding standard, Versatile Video Coding (VVC), has shown great improvement compared to its predecessor, High Efficiency Video Coding (HEVC), in terms of bitrate saving. Despite its substantial performance, compressed…

Image and Video Processing · Electrical Eng. & Systems 2021-12-09 Fatemeh Nasiri, Wassim Hamidouche, Luce Morin, Nicolas Dhollande, Gildas Cocherel

Prediction of Transformed (DCT) Video Coding Residual for Video Compression

Video compression has been investigated by means of analysis-synthesis, and more particularly by means of inpainting. The first part of our approach has been to develop the inpainting of DCT coefficients in an image. This has shown good…

Information Theory · Computer Science 2014-04-17 Matthieu Moinard, Isabelle Amonou, Pierre Duhamel, Patrice Brault

Residual vector quantization for KV cache compression in large language model

KV cache compression methods have mainly relied on scalar quantization techniques to reduce the memory requirements during decoding. In this work, we apply residual vector quantization, which has been widely used for high fidelity audio…

Machine Learning · Computer Science 2024-10-22 Ankur Kumar

Improving Pyramid Vector Quantizer with power projection

Pyramid Vector Quantizer (PVQ) is a promising technique especially for multimedia data compression, already used in Opus audio codec and considered for AV1 video codec. It quantizes vectors from Euclidean unit sphere by first projecting…

Optimization and Control · Mathematics 2017-05-16 Jarek Duda

Variable Bitrate Residual Vector Quantization for Audio Coding

Recent state-of-the-art neural audio compression models have progressively adopted residual vector quantization (RVQ). Despite this success, these models employ a fixed number of codebooks per frame, which can be suboptimal in terms of…

Sound · Computer Science 2025-04-29 Yunkee Chae, Woosung Choi, Yuhta Takida, Junghyun Koo, Yukara Ikemiya, Zhi Zhong, Kin Wai Cheuk, Marco A. Martínez-Ramírez, Kyogu Lee, Wei-Hsiang Liao, Yuki Mitsufuji

Predictive Coding For Animation-Based Video Compression

We address the problem of efficiently compressing video for conferencing-type applications. We build on recent approaches based on image animation, which can achieve good reconstruction quality at very low bitrate by representing face…

Computer Vision and Pattern Recognition · Computer Science 2023-07-11 Goluck Konuko, Stéphane Lathuilière, Giuseppe Valenzise

Neural Speech Coding for Real-time Communications using Constant Bitrate Scalar Quantization

Neural audio coding has emerged as a vivid research direction by promising good audio quality at very low bitrates unachievable by classical coding techniques. Here, end-to-end trainable autoencoder-like models represent the state of the…

Audio and Speech Processing · Electrical Eng. & Systems 2024-09-20 Andreas Brendel, Nicola Pia, Kishan Gupta, Lyonel Behringer, Guillaume Fuchs, Markus Multrus

Rate-Aware Learned Speech Compression

The rapid rise of real-time communication and large language models has significantly increased the importance of speech compression. Deep learning-based neural speech codecs have outperformed traditional signal-level speech codecs in terms…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-22 Jun Xu, Zhengxue Cheng, Guangchuan Chi, Yuhan Liu, Yuelin Hu, Li Song

Optimized Decoding-Energy-Aware Encoding in Practical VVC Implementations

The optimization of the energy demand is crucial for modern video codecs. Previous studies show that the energy demand of VVC decoders can be improved by more than 50% if specific coding tools are disabled in the encoder. However, those…

Image and Video Processing · Electrical Eng. & Systems 2024-02-16 Matthias Kränzler, Adam Wieckowski, Geetha Ramasubbu, Benjamin Bross, André Kaup, Detlev Marpe, Christian Herglotz

Optimizing Learned Image Compression on Scalar and Entropy-Constraint Quantization

The continuous improvements on image compression with variational autoencoders have led to learned codecs competitive with conventional approaches in terms of rate-distortion efficiency. Nonetheless, taking the quantization into account…

Machine Learning · Computer Science 2025-06-11 Florian Borzechowski, Michael Schäfer, Heiko Schwarz, Jonathan Pfaff, Detlev Marpe, Thomas Wiegand

Decoding-Energy-Rate-Distortion Optimization for Video Coding

This paper presents a method for generating coded video bit streams requiring less decoding energy than conventionally coded bit streams. To this end, we propose extending the standard rate-distortion optimization approach to also consider…

Image and Video Processing · Electrical Eng. & Systems 2022-03-03 Christian Herglotz, Andreas Heindel, André Kaup

On Intra Video Coding and In-loop Filtering for Neural Object Detection Networks

Classical video coding for satisfying humans as the final user is a widely investigated field of studies for visual content, and common video codecs are all optimized for the human visual system (HVS). But are the assumptions and…

Image and Video Processing · Electrical Eng. & Systems 2022-03-14 Kristian Fischer, Christian Herglotz, André Kaup

Cross-Scale Vector Quantization for Scalable Neural Speech Coding

Bitrate scalability is a desirable feature for audio coding in real-time communications. Existing neural audio codecs usually enforce a specific bitrate during training, so different models need to be trained for each target bitrate, which…

Sound · Computer Science 2022-07-08 Xue Jiang, Xiulian Peng, Huaying Xue, Yuan Zhang, Yan Lu

Video coding technique with parametric modeling of noise

This paper presents a video encoding method in which noise is encoded using a novel parametric model representing spectral envelope and spatial distribution of energy. The proposed method has been experimentally assessed using video test…

Image and Video Processing · Electrical Eng. & Systems 2019-09-04 Olgierd Stankiewicz

Decoding Energy Modeling For Versatile Video Coding

In previous research, it was shown that the software decoding energy demand of High Efficiency Video Coding (HEVC) can be reduced by 15% by using a decoding-energy-rate-distortion optimization algorithm. To achieve this, the energy…

Image and Video Processing · Electrical Eng. & Systems 2022-09-22 Matthias Kränzler, Christian Herglotz, André Kaup

Towards Video Codec Performance Evaluation: A Rate-Energy-Distortion Perspective

The Bjøntegaard Delta rate (BD-rate) objectively assesses the coding efficiency of video codecs using the rate-distortion (R-D) performance but overlooks encoding energy, which is crucial in practical applications, especially for those…

Image and Video Processing · Electrical Eng. & Systems 2024-10-03 Geetha Ramasubbu, André Kaup, Christian Herglotz

CODA: Repurposing Continuous VAEs for Discrete Tokenization

Discrete visual tokenizers transform images into a sequence of tokens, enabling token-based visual generation akin to language models. However, this process is inherently challenging, as it requires both compressing visual signals into a…

Computer Vision and Pattern Recognition · Computer Science 2025-10-01 Zeyu Liu, Zanlin Ni, Yeguo Hua, Xin Deng, Xiao Ma, Cheng Zhong, Gao Huang