Related papers: DPHMs: Diffusion Parametric Head Models for Depth-…

DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars

DiffusionAvatars synthesizes a high-fidelity 3D head avatar of a person, offering intuitive control over both pose and expression. We propose a diffusion-based neural renderer that leverages generic 2D priors to produce compelling images of…

Computer Vision and Pattern Recognition · Computer Science · 2024-04-18 · Tobias Kirschstein, Simon Giebenhain, Matthias Nießner

FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models

We introduce FaceTalk, a novel generative approach designed for synthesizing high-fidelity 3D motion sequences of talking human heads from input audio signal. To capture the expressive, detailed nature of human heads, including hair, ears,…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-19 · Shivangi Aneja, Justus Thies, Angela Dai, Matthias Nießner

MonoNPHM: Dynamic Head Reconstruction from Monocular Videos

We present Monocular Neural Parametric Head Models (MonoNPHM) for dynamic 3D head reconstructions from monocular RGB videos. To this end, we propose a latent appearance space that parameterizes a texture field on top of a neural parametric…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-31 · Simon Giebenhain, Tobias Kirschstein, Markos Georgopoulos, Martin Rünz, Lourdes Agapito, Matthias Nießner

Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation

Talking face generation has historically struggled to produce head movements and natural facial expressions without guidance from additional reference videos. Recent developments in diffusion-based generative models allow for more realistic…

Computer Vision and Pattern Recognition · Computer Science · 2023-08-01 · Michał Stypułkowski, Konstantinos Vougioukas, Sen He, Maciej Zięba, Stavros Petridis, Maja Pantic

High-Resolution Image Synthesis with Latent Diffusion Models

By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond. Additionally, their formulation allows for a…

Computer Vision and Pattern Recognition · Computer Science · 2022-04-14 · Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer

RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation

While latent diffusion models (LDMs), such as Stable Diffusion, are designed for high-resolution (HR) image generation, they often struggle with significant structural distortions when generating images at resolutions higher than their…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-09 · Boyuan Cao, Jiaxin Ye, Yujie Wei, Hongming Shan

DiffBFR: Bootstrapping Diffusion Model Towards Blind Face Restoration

Blind face restoration (BFR) is important yet challenging. Prior works prefer to exploit GAN-based frameworks to tackle this task due to the balance of quality and efficiency. However, these methods suffer from poor stability and…

Computer Vision and Pattern Recognition · Computer Science · 2023-08-09 · Xinmin Qiu, Congying Han, Zicheng Zhang, Bonan Li, Tiande Guo, Xuecheng Nie

DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis

Audio-driven talking head synthesis strives to generate lifelike video portraits from provided audio. The diffusion model, recognized for its superior quality and robust generalization, has been explored for this task. However, establishing…

Multimedia · Computer Science · 2024-09-17 · Fa-Ting Hong, Yunfei Liu, Yu Li, Changyin Zhou, Fei Yu, Dan Xu

RDPM: Solve Diffusion Probabilistic Models via Recurrent Token Prediction

Diffusion Probabilistic Models (DPMs) have emerged as the de facto approach for high-fidelity image synthesis, operating diffusion processes on continuous VAE latents, which differ significantly from the text generation methods employed by…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-30 · Xiaoping Wu, Jie Hu, Xiaoming Wei

VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation

A diffusion probabilistic model (DPM), which constructs a forward diffusion process by gradually adding noise to data points and learns the reverse denoising process to generate new samples, has been shown to handle complex data…

Computer Vision and Pattern Recognition · Computer Science · 2023-10-16 · Zhengxiong Luo, Dayou Chen, Yingya Zhang, Yan Huang, Liang Wang, Yujun Shen, Deli Zhao, Jingren Zhou, Tieniu Tan

FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model

Talking head generation is a significant research topic that still faces numerous challenges. Previous works often adopt generative adversarial networks or regression models, which are plagued by generation quality and average facial shape…

Computer Vision and Pattern Recognition · Computer Science · 2024-08-20 · Ziyu Yao, Xuxin Cheng, Zhiqi Huang

DP-LDMs: Differentially Private Latent Diffusion Models

Diffusion models (DMs) are among the most widely used generative models for producing high-quality images. However, a flurry of recent papers points out that DMs are the least private form of image generator, by extracting a significant…

Machine Learning · Statistics · 2025-03-06 · Michael F. Liu, Saiyue Lyu, Margarita Vinaroz, Mijung Park

DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion

Diffusion-based generative models have exhibited powerful generative performance in recent years. However, as many attributes exist in the data distribution and owing to several limitations of sharing the model parameters across all levels…

Audio and Speech Processing · Electrical Eng. & Systems · 2023-05-26 · Ha-Yeong Choi, Sang-Hoon Lee, Seong-Whan Lee

PHDME: Physics-Informed Diffusion Models without Explicit Governing Equations

Diffusion models provide expressive priors for forecasting trajectories of dynamical systems, but are typically unreliable in the sparse data regime. Physics-informed machine learning (PIML) improves reliability in such settings; however,…

Machine Learning · Computer Science · 2026-01-30 · Kaiyuan Tan, Kendra Givens, Peilun Li, Thomas Beckers

Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall

Discrete diffusion models offer a promising alternative to autoregressive generation through parallel decoding, but they suffer from a sampling wall: once categorical sampling occurs, rich distributional information collapses into one-hot…

Machine Learning · Computer Science · 2026-05-14 · Mingyu Jo, Jaesik Yoon, Justin Deschenaux, Caglar Gulcehre, Sungjin Ahn

RoHM: Robust Human Motion Reconstruction via Diffusion

We propose RoHM, an approach for robust 3D human motion reconstruction from monocular RGB(-D) videos in the presence of noise and occlusions. Most previous approaches either train neural networks to directly regress motion in 3D or learn…

Computer Vision and Pattern Recognition · Computer Science · 2024-04-16 · Siwei Zhang, Bharat Lal Bhatnagar, Yuanlu Xu, Alexander Winkler, Petr Kadlecek, Siyu Tang, Federica Bogo

Exploring Latent Cross-Channel Embedding for Accurate 3D Human Pose Reconstruction in a Diffusion Framework

Monocular 3D human pose estimation poses significant challenges due to the inherent depth ambiguities that arise during the reprojection process from 2D to 3D. Conventional approaches that rely on estimating an over-fit projection matrix…

Computer Vision and Pattern Recognition · Computer Science · 2024-01-19 · Junkun Jiang, Jie Chen

Diffusion-Guided Pretraining for Brain Graph Foundation Models

With the growing interest in foundation models for brain signals, graph-based pretraining has emerged as a promising paradigm for learning transferable representations from connectome data. However, existing contrastive and masked…

Machine Learning · Computer Science · 2026-03-10 · Xinxu Wei, Rong Zhou, Lifang He, Yu Zhang

ReF-LDM: A Latent Diffusion Model for Reference-based Face Image Restoration

While recent works on blind face image restoration have successfully produced impressive high-quality (HQ) images with abundant details from low-quality (LQ) input images, the generated content may not accurately reflect the real appearance…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-09 · Chi-Wei Hsiao, Yu-Lun Liu, Cheng-Kun Yang, Sheng-Po Kuo, Kevin Jou, Chia-Ping Chen

GMODiff: One-Step Gain Map Refinement with Diffusion Priors for HDR Reconstruction

Pre-trained Latent Diffusion Models (LDMs) have recently shown strong perceptual priors for low-level vision tasks, making them a promising direction for multi-exposure High Dynamic Range (HDR) reconstruction. However, directly applying…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-17 · Tao Hu, Weiyu Zhou, Yanjie Tu, Peng Wu, Wei Dong, Qingsen Yan, Yanning Zhang