Related papers: LEAD: Latent Realignment for Human Motion Diffusio…

Executing your Commands via Motion Diffusion in Latent Space

We study a challenging task, conditional human motion generation, which produces plausible human motion sequences according to various conditional inputs, such as action classes or textual descriptors. Since human motions are highly diverse…

Computer Vision and Pattern Recognition · Computer Science 2023-05-22 Xin Chen, Biao Jiang, Wen Liu, Zilong Huang, Bin Fu, Tao Chen, Jingyi Yu, Gang Yu

Length-Aware Motion Synthesis via Latent Diffusion

The target duration of a synthesized human motion is a critical attribute that requires modeling control over the motion dynamics and style. Speeding up an action performance is not merely fast-forwarding it. However, state-of-the-art…

Computer Vision and Pattern Recognition · Computer Science 2024-07-17 Alessio Sampieri, Alessio Palma, Indro Spinelli, Fabio Galasso

ReAlign: Bilingual Text-to-Motion Generation via Step-Aware Reward-Guided Alignment

Bilingual text-to-motion generation, which synthesizes 3D human motions from bilingual text inputs, holds immense potential for cross-linguistic applications in gaming, film, and robotics. However, this task faces critical challenges: the…

Computer Vision and Pattern Recognition · Computer Science 2025-08-04 Wanjiang Weng, Xiaofeng Tan, Hongsong Wang, Pan Zhou

LS-GAN: Human Motion Synthesis with Latent-space GANs

Human motion synthesis conditioned on textual input has gained significant attention in recent years due to its potential applications in various domains such as gaming, film production, and virtual reality. Conditioned Motion synthesis…

Computer Vision and Pattern Recognition · Computer Science 2025-01-06 Avinash Amballa, Gayathri Akkinapalli, Vinitra Muralikrishnan

Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model

Text-guided motion synthesis aims to generate 3D human motion that not only precisely reflects the textual description but reveals the motion details as much as possible. Pioneering methods explore the diffusion model for text-to-motion…

Computer Vision and Pattern Recognition · Computer Science 2023-12-19 Zhenyu Xie, Yang Wu, Xuehao Gao, Zhongqian Sun, Wei Yang, Xiaodan Liang

Priority-Centric Human Motion Generation in Discrete Latent Space

Text-to-motion generation is a formidable task, aiming to produce human motions that align with the input text while also adhering to human capabilities and physical laws. While there have been advancements in diffusion models, their…

Computer Vision and Pattern Recognition · Computer Science 2023-08-31 Hanyang Kong, Kehong Gong, Dongze Lian, Michael Bi Mi, Xinchao Wang

Diffusion Motion: Generate Text-Guided 3D Human Motion by Diffusion Model

We propose a simple and novel method for generating 3D human motion from complex natural language sentences, which describe different velocity, direction and composition of all kinds of actions. Different from existing methods that use…

Computer Vision and Pattern Recognition · Computer Science 2023-04-17 Zhiyuan Ren, Zhihong Pan, Xin Zhou, Le Kang

SALAD: Skeleton-aware Latent Diffusion for Text-driven Motion Generation and Editing

Text-driven motion generation has advanced significantly with the rise of denoising diffusion models. However, previous methods often oversimplify representations for the skeletal joints, temporal frames, and textual words, limiting their…

Computer Vision and Pattern Recognition · Computer Science 2025-03-19 Seokhyeon Hong, Chaelin Kim, Serin Yoon, Junghyun Nam, Sihun Cha, Junyong Noh

AMD: Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion

Generating realistic human motion sequences from text descriptions is a challenging task that requires capturing the rich expressiveness of both natural language and human motion. Recent advances in diffusion models have enabled significant…

Computer Vision and Pattern Recognition · Computer Science 2023-12-22 Beibei Jing, Youjia Zhang, Zikai Song, Junqing Yu, Wei Yang

ReAlign: Text-to-Motion Generation via Step-Aware Reward-Guided Alignment

Text-to-motion generation, which synthesizes 3D human motions from text inputs, holds immense potential for applications in gaming, film, and robotics. Recently, diffusion-based methods have been shown to generate more diversity and…

Computer Vision and Pattern Recognition · Computer Science 2025-11-25 Wanjiang Weng, Xiaofeng Tan, Junbo Wang, Guo-Sen Xie, Pan Zhou, Hongsong Wang

Text-driven Human Motion Generation with Motion Masked Diffusion Model

Text-driven human motion generation is a multimodal task that synthesizes human motion sequences conditioned on natural language. It requires the model to satisfy textual descriptions under varying conditional inputs, while generating…

Computer Vision and Pattern Recognition · Computer Science 2024-10-01 Xingyu Chen

Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation

Text-to-motion generation requires not only grounding local actions in language but also seamlessly blending these individual actions to synthesize diverse and realistic global motions. However, existing motion generation methods primarily…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Runyi Yu, Chang Liu, Xiangyang Ji, Li Yuan, Jie Chen

Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model

Text-driven human motion generation in computer vision is both significant and challenging. However, current methods are limited to producing either deterministic or imprecise motion sequences, failing to effectively control the temporal…

Computer Vision and Pattern Recognition · Computer Science 2023-09-13 Yin Wang, Zhiying Leng, Frederick W. B. Li, Shun-Cheng Wu, Xiaohui Liang

MoLingo: Motion-Language Alignment for Text-to-Motion Generation

We introduce MoLingo, a text-to-motion (T2M) model that generates realistic, lifelike human motion by denoising in a continuous latent space. Recent works perform latent space diffusion, either on the whole latent at once or…

Computer Vision and Pattern Recognition · Computer Science 2026-03-27 Yannan He, Garvita Tiwari, Xiaohan Zhang, Pankaj Bora, Tolga Birdal, Jan Eric Lenssen, Gerard Pons-Moll

LDEdit: Towards Generalized Text Guided Image Manipulation via Latent Diffusion Models

Research in vision-language models has seen rapid developments of late, enabling natural language-based interfaces for image generation and manipulation. Many existing text guided manipulation techniques are restricted to specific classes…

Computer Vision and Pattern Recognition · Computer Science 2024-05-07 Paramanand Chandramouli, Kanchana Vaishnavi Gandikota

LaMoGen: Laban Movement-Guided Diffusion for Text-to-Motion Generation

Diverse human motion generation is an increasingly important task, having various applications in computer vision, human-computer interaction and animation. While text-to-motion synthesis using diffusion models has shown success in…

Computer Vision and Pattern Recognition · Computer Science 2025-10-14 Heechang Kim, Gwanghyun Kim, Se Young Chun

Synthesizing Long-Term Human Motions with Diffusion Models via Coherent Sampling

Text-to-motion generation has gained increasing attention, but most existing methods are limited to generating short-term motions that correspond to a single sentence describing a single action. However, when a text stream describes a…

Computer Vision and Pattern Recognition · Computer Science 2023-08-04 Zhao Yang, Bing Su, Ji-Rong Wen

FLAME: Free-form Language-based Motion Synthesis & Editing

Text-based motion generation models are drawing a surge of interest for their potential for automating the motion-making process in the game, animation, or robot industries. In this paper, we propose a diffusion-based motion synthesis and…

Computer Vision and Pattern Recognition · Computer Science 2023-01-03 Jihoon Kim, Jiseob Kim, Sungjoon Choi

MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model

Human motion modeling is important for many modern graphics applications, which typically require professional skills. In order to remove the skill barriers for laymen, recent motion generation methods can directly generate human motions…

Computer Vision and Pattern Recognition · Computer Science 2022-09-01 Mingyuan Zhang, Zhongang Cai, Liang Pan, Fangzhou Hong, Xinying Guo, Lei Yang, Ziwei Liu

Text-driven Visual Synthesis with Latent Diffusion Prior

There has been tremendous progress in large-scale text-to-image synthesis driven by diffusion models enabling versatile downstream applications such as 3D object synthesis from texts, image editing, and customized generation. We present a…

Computer Vision and Pattern Recognition · Computer Science 2023-04-05 Ting-Hsuan Liao, Songwei Ge, Yiran Xu, Yao-Chih Lee, Badour AlBahar, Jia-Bin Huang