Related papers: Multi-Reference Video Coding Using Stillness Detec…

AV1 Video Coding Using Texture Analysis With Convolutional Neural Networks

Modern video codecs including the newly developed AOM/AV1 utilize hybrid coding techniques to remove spatial and temporal redundancy. However, efficient exploitation of statistical dependencies measured by a mean squared error (MSE) does…

Image and Video Processing · Electrical Eng. & Systems 2018-04-26 Di Chen, Chichen Fu, Fengqing Zhu

Fast Block Structure Determination in AV1-based Multiple Resolutions Video Encoding

The widely used adaptive HTTP streaming requires an efficient algorithm to encode the same video to different resolutions. In this paper, we propose a fast block structure determination algorithm based on the AV1 codec that accelerates high…

Multimedia · Computer Science 2018-10-17 Bichuan Guo, Yuxing Han, Jiangtao Wen

Audio-Visual Cross-Modal Compression for Generative Face Video Coding

Generative face video coding (GFVC) is vital for modern applications like video conferencing, yet existing methods primarily focus on video motion while neglecting the significant bitrate contribution of audio. Despite the well-established…

Image and Video Processing · Electrical Eng. & Systems 2025-12-18 Youmin Xu, Mengxi Guo, Shijie Zhao, Weiqi Li, Junlin Li, Li Zhang, Jian Zhang

Multi-resolution encoding and optimization for next generation video compression

Multi-encoding implies encoding the same content in multiple spatial resolutions and multiple bitrates. This work evaluates the encoder analysis correlations across 2160p, 1080p, and 540p encodings of the same video for conventional ABR…

Multimedia · Computer Science 2023-01-31 Vignesh V Menon

VVC Extension Scheme for Object Detection Using Contrast Reduction

In recent years, video analysis using Artificial Intelligence (AI) has been widely used, due to the remarkable development of image recognition technology using deep learning. In 2019, the Moving Picture Experts Group (MPEG) has started…

Computer Vision and Pattern Recognition · Computer Science 2023-05-31 Takahiro Shindo, Taiju Watanabe, Kein Yamada, Hiroshi Watanabe

I$^2$VC: A Unified Framework for Intra- & Inter-frame Video Compression

Video compression aims to reconstruct seamless frames by encoding the motion and residual information from existing frames. Previous neural video compression methods necessitate distinct codecs for three types of frames (I-frame, P-frame…

Image and Video Processing · Electrical Eng. & Systems 2024-06-04 Meiqin Liu, Chenming Xu, Yukai Gu, Chao Yao, Yao Zhao

Saliency-Driven Versatile Video Coding for Neural Object Detection

Saliency-driven image and video coding for humans has gained importance in the recent past. In this paper, we propose such a saliency-driven coding framework for the video coding for machines task using the latest video coding standard…

Computer Vision and Pattern Recognition · Computer Science 2022-03-14 Kristian Fischer, Felix Fleckenstein, Christian Herglotz, André Kaup

A Neural-network Enhanced Video Coding Framework beyond ECM

In this paper, a hybrid video compression framework is proposed that serves as a demonstrative showcase of deep learning-based approaches extending beyond the confines of traditional coding methodologies. The proposed hybrid framework is…

Computer Vision and Pattern Recognition · Computer Science 2024-02-22 Yanchen Zhao, Wenxuan He, Chuanmin Jia, Qizhe Wang, Junru Li, Yue Li, Chaoyi Lin, Kai Zhang, Li Zhang, Siwei Ma

Accuracy Improvement of Object Detection in VVC Coded Video Using YOLO-v7 Features

With advances in image recognition technology based on deep learning, automatic video analysis by Artificial Intelligence is becoming more widespread. As the amount of video used for image recognition increases, efficient compression…

Computer Vision and Pattern Recognition · Computer Science 2023-04-04 Takahiro Shindo, Taiju Watanabe, Kein Yamada, Hiroshi Watanabe

Content-Aware Preserving Image Generation

Remarkable progress has been achieved in image generation with the introduction of generative models. However, precisely controlling the content in generated images remains a challenging task due to their fundamental training objective.…

Computer Vision and Pattern Recognition · Computer Science 2024-11-26 Giang H. Le, Anh Q. Nguyen, Byeongkeun Kang, Yeejin Lee

Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics

Video coding, which targets to compress and reconstruct the whole frame, and feature compression, which only preserves and transmits the most critical information, stand at two ends of the scale. That is, one is with compactness and…

Computer Vision and Pattern Recognition · Computer Science 2023-07-19 Ling-Yu Duan, Jiaying Liu, Wenhan Yang, Tiejun Huang, Wen Gao

Dynamic Group Detection using VLM-augmented Temporal Groupness Graph

This paper proposes dynamic human group detection in videos. For detecting complex groups, not only the local appearance features of in-group members but also the global context of the scene are important. Such local and global appearance…

Computer Vision and Pattern Recognition · Computer Science 2025-09-08 Kaname Yokoyama, Chihiro Nakatani, Norimichi Ukita

An Emerging Coding Paradigm VCM: A Scalable Coding Approach Beyond Feature and Signal

In this paper, we study a new problem arising from the emerging MPEG standardization effort Video Coding for Machine (VCM), which aims to bridge the gap between visual feature compression and classical video coding. VCM is committed to…

Image and Video Processing · Electrical Eng. & Systems 2020-01-10 Sifeng Xia, Kunchangtai Liang, Wenhan Yang, Ling-Yu Duan, Jiaying Liu

Texture Segmentation Based Video Compression Using Convolutional Neural Networks

There has been a growing interest in using different approaches to improve the coding efficiency of modern video codec in recent years as demand for web-based video consumption increases. In this paper, we propose a model-based approach…

Computer Vision and Pattern Recognition · Computer Science 2018-02-09 Chichen Fu, Di Chen, Edward J. Delp, Zoe Liu, Fengqing Zhu

Variable Rate Video Compression using a Hybrid Recurrent Convolutional Learning Framework

In recent years, neural network-based image compression techniques have been able to outperform traditional codecs and have opened the gates for the development of learning-based video codecs. However, to take advantage of the high temporal…

Image and Video Processing · Electrical Eng. & Systems 2020-08-25 Aishwarya Jadhav

Recent Standard Development Activities on Video Coding for Machines

In recent years, video data has dominated internet traffic and becomes one of the major data formats. With the emerging 5G and internet of things (IoT) technologies, more and more videos are generated by edge devices, sent across networks,…

Computer Vision and Pattern Recognition · Computer Science 2021-05-27 Wen Gao, Shan Liu, Xiaozhong Xu, Manouchehr Rafie, Yuan Zhang, Igor Curcio

Multimodal Alignment with Cross-Attentive GRUs for Fine-Grained Video Understanding

Fine-grained video classification requires understanding complex spatio-temporal and semantic cues that often exceed the capacity of a single modality. In this paper, we propose a multimodal framework that fuses video, image, and text…

Computer Vision and Pattern Recognition · Computer Science 2025-07-08 Namho Kim, Junhwa Kim

End-to-End Learning for Video Frame Compression with Self-Attention

One of the core components of conventional (i.e., non-learned) video codecs consists of predicting a frame from a previously-decoded frame, by leveraging temporal correlations. In this paper, we propose an end-to-end learned system for…

Image and Video Processing · Electrical Eng. & Systems 2020-04-22 Nannan Zou, Honglei Zhang, Francesco Cricri, Hamed R. Tavakoli, Jani Lainema, Emre Aksu, Miska Hannuksela, Esa Rahtu

A Modified Fourier-Mellin Approach for Source Device Identification on Stabilized Videos

To decide whether a digital video has been captured by a given device, multimedia forensic tools usually exploit characteristic noise traces left by the camera sensor on the acquired frames. This analysis requires that the noise pattern…

Multimedia · Computer Science 2020-05-21 Sara Mandelli, Fabrizio Argenti, Paolo Bestagini, Massimo Iuliani, Alessandro Piva, Stefano Tubaro

Updated version: A Video Anomaly Detection Framework based on Appearance-Motion Semantics Representation Consistency

Video anomaly detection is an essential but challenging task. The prevalent methods mainly investigate the reconstruction difference between normal and abnormal patterns but ignore the semantics consistency between appearance and motion…

Computer Vision and Pattern Recognition · Computer Science 2023-03-10 Xiangyu Huang, Caidan Zhao, Zhiqiang Wu