Related papers: Multi-modal Pose Diffuser: A Multimodal Generative…

200 papers

The Skinned Multi-Person Linear (SMPL) model can represent a human body by mapping pose and shape parameters to body meshes. This has been shown to facilitate inferring 3D human pose and shape from images via different learning models.…

Computer Vision and Pattern Recognition · Computer Science · 2021-12-09 · Andrey Davydov, Anastasia Remizova, Victor Constantin, Sina Honari, Mathieu Salzmann, Pascal Fua

3D human pose estimation from 2D images is a challenging problem due to depth ambiguity and occlusion. Because of these challenges the task is underdetermined, where there exists multiple -- possibly infinite -- poses that are plausible…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-04 · Francis Snelgar, Ming Xu, Stephen Gould, Liang Zheng, Akshay Asthana

The Latent Diffusion Model (LDM) has demonstrated strong capabilities in high-resolution image generation and has been widely employed for Pose-Guided Person Image Synthesis (PGPIS), yielding promising results. However, the compression…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-25 · Jiaqi Liu, Jichao Zhang, Paolo Rota, Nicu Sebe

Generating consistent human images with controllable pose and appearance is essential for applications in virtual try on, image editing, and digital human creation. Current methods often suffer from occlusions, garment style drift, and pose…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-18 · Ziyu Shang, Haoran Liu, Rongchao Zhang, Zhiqian Wei, Tongtong Feng

Traditionally, monocular 3D human pose estimation employs a machine learning model to predict the most likely 3D pose for a given input image. However, a single image can be highly ambiguous and induces multiple plausible solutions for the…

Computer Vision and Pattern Recognition · Computer Science · 2022-11-30 · Karl Holmquist, Bastian Wandt

Human motion synthesis is an important task in computer graphics and computer vision. While focusing on various conditioning signals such as text, action class, or audio to guide the generation process, most existing methods utilize…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-14 · Kebing Xue, Hyewon Seo

This work targets to construct a robust human pose prior. However, it remains a persistent challenge due to biomechanical constraints and diverse human movements. Traditional priors like VAEs and NDFs often exhibit shortcomings in realism…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-26 · Junzhe Lu, Jing Lin, Hongkun Dou, Ailing Zeng, Yue Deng, Yulun Zhang, Haoqian Wang

Continuous diffusion models have demonstrated their effectiveness in addressing the inherent uncertainty and indeterminacy in monocular 3D human pose estimation (HPE). Despite their strengths, the need for large search spaces and the…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-28 · Weiquan Wang, Jun Xiao, Chunping Wang, Wei Liu, Zhao Wang, Long Chen

Inferring 3D human motion is fundamental in many applications, including understanding human activity and analyzing one's intention. While many fruitful efforts have been made to human motion prediction, most approaches focus on pose-driven…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-31 · Xuehao Gao, Yang Yang, Yang Wu, Shaoyi Du, Guo-Jun Qi

We present DPoser-X, a diffusion-based prior model for 3D whole-body human poses. Building a versatile and robust full-body human pose prior remains challenging due to the inherent complexity of articulated human poses and the scarcity of…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-05 · Junzhe Lu, Jing Lin, Hongkun Dou, Ailing Zeng, Yue Deng, Xian Liu, Zhongang Cai, Lei Yang, Yulun Zhang, Haoqian Wang, Ziwei Liu

Human pose estimation (HPE) from Radio Frequency vision (RF-vision) performs human sensing using RF signals that penetrate obstacles without revealing privacy (e.g., facial information). Recently, mmWave radar has emerged as a promising…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-23 · Junqiao Fan, Jianfei Yang, Yuecong Xu, Lihua Xie

3D human shape reconstruction under severe occlusion due to human-object or human-human interaction is a challenging problem. Parametric models i.e., SMPL(-X), which are based on the statistics across human shapes, can represent whole human…

Computer Vision and Pattern Recognition · Computer Science · 2024-10-30 · Donghwan Kim, Tae-Kyun Kim

Aligning multiple modalities in a latent space, such as images and texts, has shown to produce powerful semantic visual representations, fueling tasks like image captioning, text-to-image generation, or image grounding. In the context of…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-11 · Ginger Delmas, Philippe Weinzaepfel, Francesc Moreno-Noguer, Grégory Rogez

We present a methodology for conditional control of human shape and pose in pretrained text-to-image diffusion models using a 3D human parametric model (SMPL). Fine-tuning these diffusion models to adhere to new conditions requires large…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-08 · Benito Buchheim, Max Reimann, Jürgen Döllner

Recent work has showcased the significant potential of diffusion models in pose-guided person image synthesis. However, owing to the inconsistency in pose between the source and target images, synthesizing an image with a distinct pose,…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-22 · Fei Shen, Hu Ye, Jun Zhang, Cong Wang, Xiao Han, Wei Yang

Robust in-bed human pose estimation under blanket occlusion remains challenging due to the scarcity of reliable labeled training data for heavily covered poses. Existing approaches rely on multi-modal sensing or image-to-image translation…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-28 · Navid Aslankhani Khameneh, Marco Carletti, Cigdem Beyan

3D human pose estimation (3D HPE) has emerged as a prominent research topic, particularly in the realm of RGB-based methods. However, the use of RGB images is often limited by issues such as occlusion and privacy constraints. Consequently,…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-17 · Mengshi Qi, Jiaxuan Peng, Xianlin Zhang, Huadong Ma

Denoising diffusion probabilistic models that were initially proposed for realistic image generation have recently shown success in various perception tasks (e.g., object detection and image segmentation) and are increasingly gaining…

Computer Vision and Pattern Recognition · Computer Science · 2023-08-08 · Runyang Feng, Yixing Gao, Tze Ho Elden Tse, Xueqing Ma, Hyung Jin Chang

Language is often used to describe physical interaction, yet most 3D human pose estimation methods overlook this rich source of information. We bridge this gap by leveraging large multimodal models (LMMs) as priors for reconstructing…

Computer Vision and Pattern Recognition · Computer Science · 2025-05-16 · Sanjay Subramanian, Evonne Ng, Lea Müller, Dan Klein, Shiry Ginosar, Trevor Darrell

Previous probabilistic models for 3D Human Pose Estimation (3DHPE) aimed to enhance pose accuracy by generating multiple hypotheses. However, most of the hypotheses generated deviate substantially from the true pose. Compared to…

Computer Vision and Pattern Recognition · Computer Science · 2024-01-11 · Hongbo Kang, Yong Wang, Mengyuan Liu, Doudou Wu, Peng Liu, Xinlin Yuan, Wenming Yang