Related papers: Variable Rate Learned Wavelet Video Coding using T…
We present an end-to-end trainable wavelet video coder based on motion-compensated temporal filtering (MCTF). Thereby, we introduce a different coding scheme for learned video compression, which is currently dominated by residual and…
We propose a novel motion estimation/compensation (ME/MC) method for wavelet-based (in-band) motion compensated temporal filtering (MCTF), with application to low-bitrate video coding. Unlike the conventional in-band MCTF algorithms, which…
Learned wavelet image and video coding approaches provide an explainable framework with a latent space corresponding to a wavelet decomposition. The wavelet image coder iWave++ achieves state-of-the-art performance and has been employed for…
Scalable lossless video coding is an important aspect for many professional applications. Wavelet-based video coding decomposes an input sequence into a lowpass and a highpass subband by filtering along the temporal axis. The lowpass…
Video temporal dynamics is conventionally modeled with 3D spatial-temporal kernel or its factorized version comprised of 2D spatial kernel and 1D temporal kernel. The modeling power, nevertheless, is limited by the fixed window size and…
Latent Video Diffusion Models (LVDMs) rely on Variational Autoencoders (VAEs) to compress videos into compact latent representations. For continuous Variational Autoencoders (VAEs), achieving higher compression rates is desirable; yet, the…
Neural video compression (NVC) is a rapidly evolving video coding research area, with some models achieving superior coding efficiency compared to the latest video coding standard Versatile Video Coding (VVC). In conventional video coding…
Despite the recent success of neural networks in image feature learning, a major problem in the video domain is the lack of sufficient labeled data for learning to model temporal information. In this paper, we propose an unsupervised…
In HTTP Adaptive Streaming, video content is conventionally encoded by adapting its spatial resolution and quantization level to best match the prevailing network state and display characteristics. It is well known that the traditional…
Temporal cues in videos provide important information for recognizing actions accurately. However, temporal-discriminative features can hardly be extracted without using an annotated large-scale video action dataset for training. This paper…
Training neural video codec (NVC) with variable rate is a highly challenging task due to its complex training strategies and model structure. In this paper, we train an efficient variable bitrate neural video codec (EV-NVC) with the…
Efficient lossless coding of medical volume data with temporal axis can be achieved by motion compensated wavelet lifting. As side benefit, a scalable bit stream is generated, which allows for displaying the data at different resolution…
While learned video codecs have demonstrated great promise, they have yet to achieve sufficient efficiency for practical deployment. In this work, we propose several novel ideas for learned video compression which allow for improved…
Deep learning has enabled significant advances in feedback-based channel coding, yet existing learned schemes remain fundamentally limited: they employ fixed block lengths, suffer degraded performance at high rates, and cannot fully exploit…
In this paper, we newly introduce the concept of temporal attention filters, and describe how they can be used for human activity recognition from videos. Many high-level activities are often composed of multiple temporal parts (e.g.,…
Recently, learned video compression has achieved exciting performance. Following the traditional hybrid prediction coding framework, most learned methods generally adopt the motion estimation motion compensation (MEMC) method to remove…
Video prediction is a pixel-wise dense prediction task to infer future frames based on past frames. Missing appearance details and motion blur are still two major problems for current predictive models, which lead to image distortion and…
Learned video compression methods have gained a variety of interest in the video coding community since they have matched or even exceeded the rate-distortion (RD) performance of traditional video codecs. However, many current…
Pareto-front optimization is crucial for addressing the multi-objective challenges in video streaming, enabling the identification of optimal trade-offs between conflicting goals such as bitrate, video quality, and decoding complexity. This…
This paper introduces an online motion rate adaptation scheme for learned video compression, with the aim of achieving content-adaptive coding on individual test sequences to mitigate the domain gap between training and test data. It…