Related papers: Low Bandwidth Video-Chat Compression using Deep Ge…

Efficient conditioned face animation using frontally-viewed embedding

As the quality of few shot facial animation from landmarks increases, new applications become possible, such as ultra low bandwidth video chat compression with a high degree of realism. However, there are some important challenges to tackle…

Computer Vision and Pattern Recognition · Computer Science 2022-03-17 Maxime Oquab, Daniel Haziza, Ludovic Schwartz, Tao Xu, Katayoun Zand, Rui Wang, Peirong Liu, Camille Couprie

Ultra-low bitrate video conferencing using deep image animation

In this work we propose a novel deep learning approach for ultra-low bitrate video compression for video conferencing applications. To address the shortcomings of current video compression paradigms when the available bandwidth is extremely…

Computer Vision and Pattern Recognition · Computer Science 2020-12-02 Goluck Konuko, Giuseppe Valenzise, Stéphane Lathuilière

Identity-Preserving Talking Face Generation with Landmark and Appearance Priors

Generating talking face videos from audio attracts lots of research interest. A few person-specific methods can generate vivid videos but require the target speaker's videos for training or fine-tuning. Existing person-generic methods have…

Computer Vision and Pattern Recognition · Computer Science 2023-05-16 Weizhi Zhong, Chaowei Fang, Yinqi Cai, Pengxu Wei, Gangming Zhao, Liang Lin, Guanbin Li

Generative Semantic Coding for Ultra-Low Bitrate Visual Communication and Analysis

We consider the problem of ultra-low bit rate visual communication for remote vision analysis, human interactions and control in challenging scenarios with very low communication bandwidth, such as deep space exploration, battlefield…

Computer Vision and Pattern Recognition · Computer Science 2025-11-03 Weiming Chen, Yijia Wang, Zhihan Zhu, Zhihai He

A Hybrid Deep Animation Codec for Low-bitrate Video Conferencing

Deep generative models, and particularly facial animation schemes, can be used in video conferencing applications to efficiently compress a video through a sparse set of keypoints, without the need to transmit dense motion vectors. While…

Multimedia · Computer Science 2022-07-28 Goluck Konuko, Stéphane Lathuilière, Giuseppe Valenzise

Robust Emotion Recognition from Low Quality and Low Bit Rate Video: A Deep Learning Approach

Emotion recognition from facial expressions is tremendously useful, especially when coupled with smart devices and wireless multimedia applications. However, the inadequate network bandwidth often limits the spatial resolution of the…

Computer Vision and Pattern Recognition · Computer Science 2017-09-12 Bowen Cheng, Zhangyang Wang, Zhaobin Zhang, Zhu Li, Ding Liu, Jianchao Yang, Shuai Huang, Thomas S. Huang

Deep Multi-modality Soft-decoding of Very Low Bit-rate Face Videos

We propose a novel deep multi-modality neural network for restoring very low bit rate videos of talking heads. Such video contents are very common in social media, teleconferencing, distance education, tele-medicine, etc., and often need to…

Computer Vision and Pattern Recognition · Computer Science 2020-08-05 Yanhui Guo, Xi Zhang, Xiaolin Wu

Adversarial Video Compression Guided by Soft Edge Detection

We propose a video compression framework using conditional Generative Adversarial Networks (GANs). We rely on two encoders: one that deploys a standard video codec and another which generates low-level maps via a pipeline of down-sampling,…

Image and Video Processing · Electrical Eng. & Systems 2018-11-28 Sungsoo Kim, Jin Soo Park, Christos G. Bampis, Jaeseong Lee, Mia K. Markey, Alexandros G. Dimakis, Alan C. Bovik

Lightweight High-Fidelity Low-Bitrate Talking Face Compression for 3D Video Conference

The demand for immersive and interactive communication has driven advancements in 3D video conferencing, yet achieving high-fidelity 3D talking face representation at low bitrates remains a challenge. Traditional 2D video compression…

Computer Vision and Pattern Recognition · Computer Science 2026-01-30 Jianglong Li, Jun Xu, Bingcong Lu, Zhengxue Cheng, Hongwei Hu, Ronghua Wu, Li Song

Audio-Visual Driven Compression for Low-Bitrate Talking Head Videos

Talking head video compression has advanced with neural rendering and keypoint-based methods, but challenges remain, especially at low bit rates, including handling large head movements, suboptimal lip synchronization, and distorted facial…

Image and Video Processing · Electrical Eng. & Systems 2025-06-17 Riku Takahashi, Ryugo Morita, Jinjia Zhou

High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model

Audio-driven talking face video generation has attracted increasing attention due to its huge industrial potential. Some previous methods focus on learning a direct mapping from audio to visual content. Despite progress, they often struggle…

Computer Vision and Pattern Recognition · Computer Science 2024-08-13 Weizhi Zhong, Junfan Lin, Peixin Chen, Liang Lin, Guanbin Li

Landmark Guided Visual Feature Extractor for Visual Speech Recognition with Limited Resource

Visual speech recognition is a technique to identify spoken content in silent speech videos, which has raised significant attention in recent years. Advancements in data-driven deep learning methods have significantly improved both the…

Computer Vision and Pattern Recognition · Computer Science 2025-08-12 Lei Yang, Junshan Jin, Mingyuan Zhang, Yi He, Bofan Chen, Shilin Wang

Gemino: Practical and Robust Neural Compression for Video Conferencing

Video conferencing systems suffer from poor user experience when network conditions deteriorate because current video codecs simply cannot operate at extremely low bitrates. Recently, several neural alternatives have been proposed that…

Networking and Internet Architecture · Computer Science 2023-10-23 Vibhaalakshmi Sivaraman, Pantea Karimi, Vedantha Venkatapathy, Mehrdad Khani, Sadjad Fouladi, Mohammad Alizadeh, Frédo Durand, Vivienne Sze

Reconstructing Faces from fMRI Patterns using Deep Generative Neural Networks

While objects from different categories can be reliably decoded from fMRI brain response patterns, it has proved more difficult to distinguish visually similar inputs, such as different instances of the same category. Here, we apply a…

Human-Computer Interaction · Computer Science 2021-02-23 Rufin VanRullen, Leila Reddy

A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation

Virtual humans have gained considerable attention in numerous industries, e.g., entertainment and e-commerce. As a core technology, synthesizing photorealistic face frames from target speech and facial identity has been actively studied…

Sound · Computer Science 2023-05-01 Bo-Kyeong Kim, Jaemin Kang, Daeun Seo, Hancheol Park, Shinkook Choi, Hyoung-Kyu Song, Hyungshin Kim, Sungsu Lim

Feedback Recurrent Autoencoder for Video Compression

Recent advances in deep generative modeling have enabled efficient modeling of high dimensional data distributions and opened up a new horizon for solving data compression problems. Specifically, autoencoder based learned image or video…

Machine Learning · Computer Science 2020-04-10 Adam Golinski, Reza Pourreza, Yang Yang, Guillaume Sautiere, Taco S Cohen

Hierarchical Cross-Modal Talking Face Generation with Dynamic Pixel-Wise Loss

We devise a cascade GAN approach to generate talking face video, which is robust to different face shapes, view angles, facial characteristics, and noisy audio conditions. Instead of learning a direct mapping from audio to video frames, we…

Computer Vision and Pattern Recognition · Computer Science 2019-05-13 Lele Chen, Ross K. Maddox, Zhiyao Duan, Chenliang Xu

VineetVC: Adaptive Video Conferencing Under Severe Bandwidth Constraints Using Audio-Driven Talking-Head Reconstruction

Intense bandwidth depletion within consumer and constrained networks has the potential to undermine the stability of real-time video conferencing: encoder rate management becomes saturated, packet loss escalates, frame rates deteriorate,…

Image and Video Processing · Electrical Eng. & Systems 2026-02-16 Vineet Kumar Rakesh, Soumya Mazumdar, Tapas Samanta, Hemendra Kumar Pandey, Amitabha Das, Sarbajit Pal

Compressing Video Calls using Synthetic Talking Heads

We leverage the modern advancements in talking head generation to propose an end-to-end system for talking head video compression. Our algorithm transmits pivot frames intermittently while the rest of the talking head video is generated by…

Computer Vision and Pattern Recognition · Computer Science 2022-10-10 Madhav Agarwal, Anchit Gupta, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C V Jawahar

Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose

Real-world talking faces are often accompanied by natural head movement. However, most existing talking face video generation methods only consider facial animation with fixed head pose. In this paper, we address this problem by proposing a…

Computer Vision and Pattern Recognition · Computer Science 2020-03-06 Ran Yi, Zipeng Ye, Juyong Zhang, Hujun Bao, Yong-Jin Liu