Learned Video Compression for YUV 4:2:0 Content Using Flow-based Conditional Inter-frame Coding
Abstract
This paper proposes a learning-based video compression framework for variable-rate coding on YUV 4:2:0 content. Most existing learning-based video compression models adopt the traditional hybrid coding architecture, in which temporal prediction is followed by residual coding. However, recent studies have shown that residual coding is sub-optimal from an information-theoretic perspective. In addition, most existing models are optimized for RGB content, and they require separate models for variable-rate coding. To address these issues, this work presents an attempt to incorporate conditional inter-frame coding for YUV 4:2:0 content. We introduce a conditional flow-based inter-frame coder to improve inter-frame coding efficiency. To adapt our codec to YUV 4:2:0 content, we adopt a simple strategy of using space-to-depth and depth-to-space conversions. Lastly, we employ a rate-adaption net to achieve variable-rate coding without training multiple models. Experimental results show that our model performs better than x265 on the UVG and MCL-JCV datasets in terms of PSNR-YUV. However, on the more challenging datasets from ISCAS'22 GC, there is still ample room for improvement. This shortfall is due to the lack of inter-frame coding capability at a large GOP size, and it can be mitigated by increasing the model capacity and applying an error propagation-aware training strategy.
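The abstract mentions space-to-depth and depth-to-space conversions but does not spell out the layout. As a minimal sketch, assuming a block size of 2 and a six-channel packing (four luma sub-planes plus U and V), one plausible arrangement in NumPy is shown here. The function names `pack_yuv420` and `unpack_yuv420` and the channel order are illustrative assumptions, not details confirmed by the paper.

```python
import numpy as np


def space_to_depth(plane, block=2):
    """Rearrange an (H, W) plane into (block*block, H/block, W/block)."""
    h, w = plane.shape
    assert h % block == 0 and w % block == 0, "plane size must be divisible by block"
    # Index layout after reshape: [row_block, row_offset, col_block, col_offset]
    x = plane.reshape(h // block, block, w // block, block)
    # Bring the two offsets to the front and merge them into one channel axis.
    return x.transpose(1, 3, 0, 2).reshape(block * block, h // block, w // block)


def depth_to_space(channels, block=2):
    """Inverse of space_to_depth: (block*block, h, w) -> (h*block, w*block)."""
    c, h, w = channels.shape
    assert c == block * block, "channel count must equal block*block"
    x = channels.reshape(block, block, h, w)
    return x.transpose(2, 0, 3, 1).reshape(h * block, w * block)


def pack_yuv420(y, u, v):
    """Assumed packing: 4 luma sub-planes + U + V -> (6, H/2, W/2)."""
    return np.concatenate([space_to_depth(y), u[None], v[None]], axis=0)


def unpack_yuv420(packed):
    """Split the assumed 6-channel tensor back into Y, U, V planes."""
    return depth_to_space(packed[:4]), packed[4], packed[5]
```

With this layout, the full-resolution Y plane and the half-resolution U and V planes share one spatial size, so a codec designed for multi-channel input can process them jointly; the inverse mapping restores the original 4:2:0 planes exactly.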
Cite
@article{arxiv.2210.08225,
title = {Learned Video Compression for YUV 4:2:0 Content Using Flow-based Conditional Inter-frame Coding},
author = {Yung-Han Ho and Chih-Hsuan Lin and Peng-Yu Chen and Mu-Jung Chen and Chih-Peng Chang and Wen-Hsiao Peng and Hsueh-Ming Hang},
journal = {arXiv preprint arXiv:2210.08225},
year = {2022}
}
Comments
Accepted by ISCAS 2022