Related papers: Multi-modal Pose Diffuser: A Multimodal Generative…

Adversarial Parametric Pose Prior

The Skinned Multi-Person Linear (SMPL) model can represent a human body by mapping pose and shape parameters to body meshes. This has been shown to facilitate inferring 3D human pose and shape from images via different learning models.…

Computer Vision and Pattern Recognition · Computer Science · 2021-12-09 · Andrey Davydov, Anastasia Remizova, Victor Constantin, Sina Honari, Mathieu Salzmann, Pascal Fua

Flexible Geometric Guidance for Probabilistic Human Pose Estimation with Diffusion Models

3D human pose estimation from 2D images is a challenging problem due to depth ambiguity and occlusion. Because of these challenges the task is underdetermined, where there exists multiple -- possibly infinite -- poses that are plausible…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-04 · Francis Snelgar, Ming Xu, Stephen Gould, Liang Zheng, Akshay Asthana

Multi-focal Conditioned Latent Diffusion for Person Image Synthesis

The Latent Diffusion Model (LDM) has demonstrated strong capabilities in high-resolution image generation and has been widely employed for Pose-Guided Person Image Synthesis (PGPIS), yielding promising results. However, the compression…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-25 · Jiaqi Liu, Jichao Zhang, Paolo Rota, Nicu Sebe

PMMD: A pose-guided multi-view multi-modal diffusion for person generation

Generating consistent human images with controllable pose and appearance is essential for applications in virtual try on, image editing, and digital human creation. Current methods often suffer from occlusions, garment style drift, and pose…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-18 · Ziyu Shang, Haoran Liu, Rongchao Zhang, Zhiqian Wei, Tongtong Feng

DiffPose: Multi-hypothesis Human Pose Estimation using Diffusion models

Traditionally, monocular 3D human pose estimation employs a machine learning model to predict the most likely 3D pose for a given input image. However, a single image can be highly ambiguous and induces multiple plausible solutions for the…

Computer Vision and Pattern Recognition · Computer Science · 2022-11-30 · Karl Holmquist, Bastian Wandt

Shape Conditioned Human Motion Generation with Diffusion Model

Human motion synthesis is an important task in computer graphics and computer vision. While focusing on various conditioning signals such as text, action class, or audio to guide the generation process, most existing methods utilize…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-14 · Kebing Xue, Hyewon Seo

DPoser: Diffusion Model as Robust 3D Human Pose Prior

This work targets to construct a robust human pose prior. However, it remains a persistent challenge due to biomechanical constraints and diverse human movements. Traditional priors like VAEs and NDFs often exhibit shortcomings in realism…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-26 · Junzhe Lu, Jing Lin, Hongkun Dou, Ailing Zeng, Yue Deng, Yulun Zhang, Haoqian Wang

$\text{Di}^2\text{Pose}$: Discrete Diffusion Model for Occluded 3D Human Pose Estimation

Continuous diffusion models have demonstrated their effectiveness in addressing the inherent uncertainty and indeterminacy in monocular 3D human pose estimation (HPE). Despite their strengths, the need for large search spaces and the…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-28 · Weiquan Wang, Jun Xiao, Chunping Wang, Wei Liu, Zhao Wang, Long Chen

Multi-Condition Latent Diffusion Network for Scene-Aware Neural Human Motion Prediction

Inferring 3D human motion is fundamental in many applications, including understanding human activity and analyzing one's intention. While many fruitful efforts have been made to human motion prediction, most approaches focus on pose-driven…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-31 · Xuehao Gao, Yang Yang, Yang Wu, Shaoyi Du, Guo-Jun Qi

DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior

We present DPoser-X, a diffusion-based prior model for 3D whole-body human poses. Building a versatile and robust full-body human pose prior remains challenging due to the inherent complexity of articulated human poses and the scarcity of…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-05 · Junzhe Lu, Jing Lin, Hongkun Dou, Ailing Zeng, Yue Deng, Xian Liu, Zhongang Cai, Lei Yang, Yulun Zhang, Haoqian Wang, Ziwei Liu

Diffusion Model is a Good Pose Estimator from 3D RF-Vision

Human pose estimation (HPE) from Radio Frequency vision (RF-vision) performs human sensing using RF signals that penetrate obstacles without revealing privacy (e.g., facial information). Recently, mmWave radar has emerged as a promising…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-23 · Junqiao Fan, Jianfei Yang, Yuecong Xu, Lihua Xie

Multi-hypotheses Conditioned Point Cloud Diffusion for 3D Human Reconstruction from Occluded Images

3D human shape reconstruction under severe occlusion due to human-object or human-human interaction is a challenging problem. Parametric models i.e., SMPL(-X), which are based on the statistics across human shapes, can represent whole human…

Computer Vision and Pattern Recognition · Computer Science · 2024-10-30 · Donghwan Kim, Tae-Kyun Kim

PoseEmbroider: Towards a 3D, Visual, Semantic-aware Human Pose Representation

Aligning multiple modalities in a latent space, such as images and texts, has shown to produce powerful semantic visual representations, fueling tasks like image captioning, text-to-image generation, or image grounding. In the context of…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-11 · Ginger Delmas, Philippe Weinzaepfel, Francesc Moreno-Noguer, Grégory Rogez

Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation

We present a methodology for conditional control of human shape and pose in pretrained text-to-image diffusion models using a 3D human parametric model (SMPL). Fine-tuning these diffusion models to adhere to new conditions requires large…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-08 · Benito Buchheim, Max Reimann, Jürgen Döllner

Advancing Pose-Guided Image Synthesis with Progressive Conditional Diffusion Models

Recent work has showcased the significant potential of diffusion models in pose-guided person image synthesis. However, owing to the inconsistency in pose between the source and target images, synthesizing an image with a distinct pose,…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-22 · Fei Shen, Hu Ye, Jun Zhang, Cong Wang, Xiao Han, Wei Yang

Geometry-Conditioned Diffusion for Occlusion-Robust In-Bed Pose Estimation

Robust in-bed human pose estimation under blanket occlusion remains challenging due to the scarcity of reliable labeled training data for heavily covered poses. Existing approaches rely on multi-modal sensing or image-to-image translation…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-28 · Navid Aslankhani Khameneh, Marco Carletti, Cigdem Beyan

Towards Balanced Multi-Modal Learning in 3D Human Pose Estimation

3D human pose estimation (3D HPE) has emerged as a prominent research topic, particularly in the realm of RGB-based methods. However, the use of RGB images is often limited by issues such as occlusion and privacy constraints. Consequently,…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-17 · Mengshi Qi, Jiaxuan Peng, Xianlin Zhang, Huadong Ma

DiffPose: SpatioTemporal Diffusion Model for Video-Based Human Pose Estimation

Denoising diffusion probabilistic models that were initially proposed for realistic image generation have recently shown success in various perception tasks (e.g., object detection and image segmentation) and are increasingly gaining…

Computer Vision and Pattern Recognition · Computer Science · 2023-08-08 · Runyang Feng, Yixing Gao, Tze Ho Elden Tse, Xueqing Ma, Hyung Jin Chang

Pose Priors from Language Models

Language is often used to describe physical interaction, yet most 3D human pose estimation methods overlook this rich source of information. We bridge this gap by leveraging large multimodal models (LMMs) as priors for reconstructing…

Computer Vision and Pattern Recognition · Computer Science · 2025-05-16 · Sanjay Subramanian, Evonne Ng, Lea Müller, Dan Klein, Shiry Ginosar, Trevor Darrell

Diffusion-based Pose Refinement and Muti-hypothesis Generation for 3D Human Pose Estimaiton

Previous probabilistic models for 3D Human Pose Estimation (3DHPE) aimed to enhance pose accuracy by generating multiple hypotheses. However, most of the hypotheses generated deviate substantially from the true pose. Compared to…

Computer Vision and Pattern Recognition · Computer Science · 2024-01-11 · Hongbo Kang, Yong Wang, Mengyuan Liu, Doudou Wu, Peng Liu, Xinlin Yuan, Wenming Yang