Related papers: Single-step Diffusion-based Video Coding with Sema…

200 papers

In this work, we first propose DiffVC-OSD, a One-Step Diffusion-based Perceptual Neural Video Compression framework. Unlike conventional multi-step diffusion-based methods, DiffVC-OSD feeds the reconstructed latent representation directly…

Image and Video Processing · Electrical Eng. & Systems · 2025-08-12 · Wenzhuo Ma, Zhenzhong Chen

Although there have been significant advancements in image compression techniques, such as standard and learned codecs, these methods still suffer from severe quality degradation at extremely low bits per pixel. While recent diffusion-based…

Image and Video Processing · Electrical Eng. & Systems · 2025-09-23 · Chanung Park, Joo Chan Lee, Jong Hwan Ko

While recent diffusion-based generative image codecs have shown impressive performance, their iterative sampling process introduces unpleasing latency. In this work, we revisit the design of a diffusion-based codec and argue that multi-step…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-27 · Naifu Xue, Zhaoyang Jia, Jiahao Li, Bin Li, Yuan Zhang, Yan Lu

Traditional video codecs optimized for pixel fidelity collapse at ultra-low bitrates and produce severe artifacts. This failure arises from a fundamental misalignment between pixel accuracy and human perception. We propose a semantic video…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-07 · Lingdong Wang, Guan-Ming Su, Divya Kothandaraman, Tsung-Wei Huang, Mohammad Hajiesmaili, Ramesh K. Sitaraman

Diffusion-based image compression has shown remarkable potential for achieving ultra-low bitrate coding (less than 0.05 bits per pixel) with high realism, by leveraging the generative priors of large pre-trained text-to-image diffusion…

Image and Video Processing · Electrical Eng. & Systems · 2025-06-30 · Tianyu Zhang, Xin Luo, Li Li, Dong Liu

Modern video codecs and learning-based approaches struggle for semantic reconstruction at extremely low bit-rates due to reliance on low-level spatiotemporal redundancies. Generative models, especially diffusion models, offer a new paradigm…

Image and Video Processing · Electrical Eng. & Systems · 2026-02-06 · Maojun Zhang, Haotian Wu, Richeng Jin, Deniz Gunduz, Krystian Mikolajczyk

Efficient video coding is highly dependent on exploiting the temporal redundancy, which is usually achieved by extracting and leveraging the temporal context in the emerging conditional coding-based neural video codec (NVC). Although the…

Image and Video Processing · Electrical Eng. & Systems · 2025-05-21 · Chuanbo Tang, Zhuoyuan Li, Yifan Bian, Li Li, Dong Liu

The practical deployment of diffusion-based Neural Video Compression (NVC) faces critical challenges, including severe information loss, prohibitive inference latency, and poor temporal consistency. To bridge this gap, we propose DiffVC-RT,…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-29 · Wenzhuo Ma, Zhenzhong Chen

Recently, Neural Video Compression (NVC) techniques have achieved remarkable performance, even surpassing the best traditional lossy video codec. However, most existing NVC methods heavily rely on transmitting Motion Vector (MV) to generate…

Computer Vision and Pattern Recognition · Computer Science · 2024-06-13 · Feng Wang, Haihang Ruan, Zhihuang Xie, Ronggang Wang, Xiangyu Yue

Most Neural Video Codecs (NVCs) only employ temporal references to generate temporal-only contexts and latent prior. These temporal-only NVCs fail to handle large motions or emerging objects due to limited contexts and misaligned latent…

Image and Video Processing · Electrical Eng. & Systems · 2025-05-09 · Yifan Bian, Chuanbo Tang, Li Li, Dong Liu

Diffusion-based image compression has demonstrated impressive perceptual performance. However, it suffers from two critical drawbacks: (1) excessive decoding latency due to multi-step sampling, and (2) poor fidelity resulting from…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-03 · Zheng Chen, Mingde Zhou, Jinpei Guo, Jiale Yuan, Yifei Ji, Yulun Zhang

In recent years, diffusion models have made remarkable strides in text-to-video generation, sparking a quest for enhanced control over video outputs to more accurately reflect user intentions. Traditional efforts predominantly focus on…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-18 · Mingxiao Li, Bo Wan, Marie-Francine Moens, Tinne Tuytelaars

Perceptual optimization is widely recognized as essential for neural compression, yet balancing the rate-distortion-perception tradeoff remains challenging. This difficulty is especially pronounced in video compression, where frame-wise…

Image and Video Processing · Electrical Eng. & Systems · 2025-10-14 · Zongyu Guo, Zhaoyang Jia, Jiahao Li, Xiaoyi Zhang, Bin Li, Yan Lu

Recently, perceptual image compression has achieved significant advancements, delivering high visual quality at low bitrates for natural images. However, for screen content, existing methods often produce noticeable artifacts when…

Computer Vision and Pattern Recognition · Computer Science · 2025-05-12 · Tongda Xu, Jiahao Li, Bin Li, Yan Wang, Ya-Qin Zhang, Yan Lu

Diffusion models provide a powerful generative prior for perceptual reconstruction at ultra-low bitrates, but effective video compression requires controlling the generative process using highly compact conditioning signals. In this work,…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-05 · Amirhosein Javadi, Shirin Saeedi Bidokhti, Tara Javidi

The emerging conditional coding-based neural video codec (NVC) shows superiority over commonly-used residual coding-based codec and the latest NVC already claims to outperform the best traditional codec. However, there still exist critical…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-01 · Jiahao Li, Bin Li, Yan Lu

Ultra-high-resolution streaming and emerging immersive services are driving rapidly increasing wireless video traffic. However, perceptually pleasing video transmission over bandwidth-limited and latency-constrained wireless links remains…

Image and Video Processing · Electrical Eng. & Systems · 2026-05-20 · Yinhuan Huang, Zhijin Qin

Existing multimodal large model-based image compression frameworks often rely on a fragmented integration of semantic retrieval, latent compression, and generative models, resulting in suboptimal performance in both reconstruction fidelity…

Computer Vision and Pattern Recognition · Computer Science · 2025-05-14 · Anle Ke, Xu Zhang, Tong Chen, Ming Lu, Chao Zhou, Jiawen Gu, Zhan Ma

While recent neural codecs achieve strong performance at low bitrates when optimized for perceptual quality, their effectiveness deteriorates significantly under ultra-low bitrate conditions. To mitigate this, generative compression methods…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-06 · Chuqin Zhou, Xiaoyue Ling, Yunuo Chen, Jincheng Dai, Guo Lu, Wenjun Zhang

Advancements in text-to-image generative AI with large multimodal models are spreading into the field of image compression, creating high-quality representation of images at extremely low bit rates. This work introduces novel components to…

Image and Video Processing · Electrical Eng. & Systems · 2025-06-02 · Cheng-Lin Wu, Hyomin Choi, Ivan V. Bajić