Related papers: Make-Your-Anchor: A Diffusion-based 2D Avatar Gene…

Hyper Diffusion Avatars: Dynamic Human Avatar Generation using Network Weight Space Diffusion

Creating human avatars is a highly desirable yet challenging task. Recent advancements in radiance field rendering have achieved unprecedented photorealism and real-time performance for personalized dynamic human avatars. However, these…

Graphics · Computer Science · 2025-09-09 · Dongliang Cao, Guoxing Sun, Marc Habermann, Florian Bernard

Instant 3D Human Avatar Generation using Image Diffusion Models

We present AvatarPopUp, a method for fast, high quality 3D human avatar generation from different input modalities, such as images and text prompts and with control over the generated pose and shape. The common theme is the use of…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-15 · Nikos Kolotouros, Thiemo Alldieck, Enric Corona, Eduard Gabriel Bazavan, Cristian Sminchisescu

Dancing Avatar: Pose and Text-Guided Human Motion Videos Synthesis with Image Diffusion Model

The rising demand for creating lifelike avatars in the digital realm has led to an increased need for generating high-quality human videos guided by textual descriptions and poses. We propose Dancing Avatar, designed to fabricate human…

Computer Vision and Pattern Recognition · Computer Science · 2023-08-16 · Bosheng Qin, Wentao Ye, Qifan Yu, Siliang Tang, Yueting Zhuang

LLIA -- Enabling Low-Latency Interactive Avatars: Real-Time Audio-Driven Portrait Video Generation with Diffusion Models

Diffusion-based models have gained wide adoption in virtual human generation due to their outstanding expressiveness. However, their substantial computational requirements have constrained their deployment in real-time interactive…

Computer Vision and Pattern Recognition · Computer Science · 2025-06-09 · Haojie Yu, Zhaonian Wang, Yihan Pan, Meng Cheng, Hao Yang, Chao Wang, Tao Xie, Xiaoming Xu, Xiaoming Wei, Xunliang Cai

Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion

Animatable head avatar generation typically requires extensive data for training. To reduce the data requirements, a natural solution is to leverage existing data-free static avatar generation methods, such as pre-trained diffusion models…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-26 · Zhenglin Zhou, Fan Ma, Hehe Fan, Tat-Seng Chua

Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation

Recent advances in generative diffusion models have enabled the previously unfeasible capability of generating 3D assets from a single input image or a text prompt. In this work, we aim to enhance the quality and functionality of these…

Computer Vision and Pattern Recognition · Computer Science · 2024-04-03 · Xiyi Chen, Marko Mihajlovic, Shaofei Wang, Sergey Prokudin, Siyu Tang

DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars

DiffusionAvatars synthesizes a high-fidelity 3D head avatar of a person, offering intuitive control over both pose and expression. We propose a diffusion-based neural renderer that leverages generic 2D priors to produce compelling images of…

Computer Vision and Pattern Recognition · Computer Science · 2024-04-18 · Tobias Kirschstein, Simon Giebenhain, Matthias Nießner

VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis

We propose VLOGGER, a method for audio-driven human video generation from a single input image of a person, which builds on the success of recent generative diffusion models. Our method consists of 1) a stochastic human-to-3d-motion…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-14 · Enric Corona, Andrei Zanfir, Eduard Gabriel Bazavan, Nikos Kolotouros, Thiemo Alldieck, Cristian Sminchisescu

AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation

We introduce AvatarBooth, a novel method for generating high-quality 3D avatars using text prompts or specific images. Unlike previous approaches that can only synthesize avatars based on simple text descriptions, our method enables the…

Computer Vision and Pattern Recognition · Computer Science · 2023-06-19 · Yifei Zeng, Yuanxun Lu, Xinya Ji, Yao Yao, Hao Zhu, Xun Cao

StreamAvatar: Streaming Diffusion Models for Real-Time Interactive Human Avatars

Real-time, streaming interactive avatars represent a critical yet challenging goal in digital human research. Although diffusion-based human avatar generation methods achieve remarkable success, their non-causal architecture and high…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-31 · Zhiyao Sun, Ziqiao Peng, Yifeng Ma, Yi Chen, Zhengguang Zhou, Zixiang Zhou, Guozhen Zhang, Youliang Zhang, Yuan Zhou, Qinglin Lu, Yong-Jin Liu

ConsistentAvatar: Learning to Diffuse Fully Consistent Talking Head Avatar with Temporal Guidance

Diffusion models have shown impressive potential on talking head generation. While plausible appearance and talking effect are achieved, these methods still suffer from temporal, 3D or expression inconsistency due to the error accumulation…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-26 · Haijie Yang, Zhenyu Zhang, Hao Tang, Jianjun Qian, Jian Yang

Anchored Diffusion for Video Face Reenactment

Video generation has drawn significant interest recently, pushing the development of large-scale models capable of producing realistic videos with coherent motion. Due to memory constraints, these models typically generate short video…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-23 · Idan Kligvasser, Regev Cohen, George Leifman, Ehud Rivlin, Michael Elad

FAAC: Facial Animation Generation with Anchor Frame and Conditional Control for Superior Fidelity and Editability

Over recent years, diffusion models have facilitated significant advancements in video generation. Yet, the creation of face-related videos still confronts issues such as low facial fidelity, lack of frame consistency, limited editability…

Computer Vision and Pattern Recognition · Computer Science · 2023-12-22 · Linze Li, Sunqi Fan, Hengjun Pu, Zhaodong Bing, Yao Tang, Tianzhu Ye, Tong Yang, Liangyu Chen, Jiajun Liang

TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model

Recently, 2D speaking avatars have increasingly participated in everyday scenarios due to the fast development of facial animation techniques. However, most existing works neglect the explicit control of human bodies. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science · 2024-10-15 · Jiazhi Guan, Quanwei Yang, Kaisiyuan Wang, Hang Zhou, Shengyi He, Zhiliang Xu, Haocheng Feng, Errui Ding, Jingdong Wang, Hongtao Xie, Youjian Zhao, Ziwei Liu

Disentangled Clothed Avatar Generation with Layered Representation

Clothed avatar generation has wide applications in virtual and augmented reality, filmmaking, and more. Previous methods have achieved success in generating diverse digital avatars, however, generating avatars with disentangled components…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-08 · Weitian Zhang, Yichao Yan, Sijing Wu, Manwen Liao, Xiaokang Yang

MVP4D: Multi-View Portrait Video Diffusion for Animatable 4D Avatars

Digital human avatars aim to simulate the dynamic appearance of humans in virtual environments, enabling immersive experiences across gaming, film, virtual reality, and more. However, the conventional process for creating and animating…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-15 · Felix Taubner, Ruihang Zhang, Mathieu Tuli, Sherwin Bahmani, David B. Lindell

Multimodal Generation of Animatable 3D Human Models with AvatarForge

We introduce AvatarForge, a framework for generating animatable 3D human avatars from text or image inputs using AI-driven procedural generation. While diffusion-based methods have made strides in general 3D object generation, they struggle…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-12 · Xinhang Liu, Yu-Wing Tai, Chi-Keung Tang

Stable Video-Driven Portraits

Portrait animation aims to generate photo-realistic videos from a single source image by reenacting the expression and pose from a driving video. While early methods relied on 3D morphable models or feature warping techniques, they often…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-23 · Mallikarjun B. R., Fei Yin, Vikram Voleti, Nikita Drobyshev, Maksim Lapin, Aaryaman Vasishta, Varun Jampani

Articulated 3D Head Avatar Generation using Text-to-Image Diffusion Models

The ability to generate diverse 3D articulated head avatars is vital to a plethora of applications, including augmented reality, cinematography, and education. Recent work on text-guided 3D object generation has shown great promise in…

Computer Vision and Pattern Recognition · Computer Science · 2023-07-12 · Alexander W. Bergman, Wang Yifan, Gordon Wetzstein

FLAP: Fully-controllable Audio-driven Portrait Video Generation through 3D head conditioned diffusion model

Diffusion-based video generation techniques have significantly improved zero-shot talking-head avatar generation, enhancing the naturalness of both head motion and facial expressions. However, existing methods suffer from poor…

Graphics · Computer Science · 2025-04-24 · Lingzhou Mu, Baiji Liu, Ruonan Zhang, Guiming Mo, Jiawei Jin, Kai Zhang, Haozhi Huang