Related papers: MotionDiffuse: Text-Driven Human Motion Generation…

Human Motion Diffusion Model

Natural and expressive human motion generation is the holy grail of computer animation. It is a challenging task, due to the diversity of possible motion, human perceptual sensitivity to it, and the difficulty of accurately describing it.…

Computer Vision and Pattern Recognition · Computer Science 2022-10-04 Guy Tevet, Sigal Raab, Brian Gordon, Yonatan Shafir, Daniel Cohen-Or, Amit H. Bermano

ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model

3D human motion generation is crucial for the creative industry. Recent advances rely on generative models with domain knowledge for text-driven motion generation, leading to substantial progress in capturing common motions. However, the…

Computer Vision and Pattern Recognition · Computer Science 2023-04-04 Mingyuan Zhang, Xinying Guo, Liang Pan, Zhongang Cai, Fangzhou Hong, Huirong Li, Lei Yang, Ziwei Liu

Text-driven Human Motion Generation with Motion Masked Diffusion Model

Text-driven human motion generation is a multimodal task that synthesizes human motion sequences conditioned on natural language. It requires the model to satisfy textual descriptions under varying conditional inputs, while generating…

Computer Vision and Pattern Recognition · Computer Science 2024-10-01 Xingyu Chen

ReinDiffuse: Crafting Physically Plausible Motions with Reinforced Diffusion Model

Generating human motion from textual descriptions is a challenging task. Existing methods either struggle with physical credibility or are limited by the complexities of physics simulations. In this paper, we present ReinDiffuse that…

Computer Vision and Pattern Recognition · Computer Science 2024-10-16 Gaoge Han, Mingjiang Liang, Jinglei Tang, Yongkang Cheng, Wei Liu, Shaoli Huang

Towards Robust and Controllable Text-to-Motion via Masked Autoregressive Diffusion

Generating 3D human motion from text descriptions remains challenging due to the diverse and complex nature of human motion. While existing methods excel within the training distribution, they often struggle with out-of-distribution…

Computer Vision and Pattern Recognition · Computer Science 2026-01-09 Zongye Zhang, Bohan Kong, Qingjie Liu, Yunhong Wang

Realistic Human Motion Generation with Cross-Diffusion Models

We introduce the Cross Human Motion Diffusion Model (CrossDiff), a novel approach for generating high-quality human motion based on textual descriptions. Our method integrates 3D and 2D information using a shared transformer network within…

Computer Vision and Pattern Recognition · Computer Science 2024-08-06 Zeping Ren, Shaoli Huang, Xiu Li

MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation

Controllable generation of 3D human motions becomes an important topic as the world embraces digital transformation. Existing works, though making promising progress with the advent of diffusion models, heavily rely on meticulously captured…

Computer Vision and Pattern Recognition · Computer Science 2024-01-25 Nhat M. Hoang, Kehong Gong, Chuan Guo, Michael Bi Mi

Diffusion Motion: Generate Text-Guided 3D Human Motion by Diffusion Model

We propose a simple and novel method for generating 3D human motion from complex natural language sentences, which describe different velocity, direction and composition of all kinds of actions. Different from existing methods that use…

Computer Vision and Pattern Recognition · Computer Science 2023-04-17 Zhiyuan Ren, Zhihong Pan, Xin Zhou, Le Kang

Text-driven Motion Generation: Overview, Challenges and Directions

Text-driven motion generation offers a powerful and intuitive way to create human movements directly from natural language. By removing the need for predefined motion inputs, it provides a flexible and accessible approach to controlling…

Computer Vision and Pattern Recognition · Computer Science 2025-05-15 Ali Rida Sahili, Najett Neji, Hedi Tabia

PhysDiff: Physics-Guided Human Motion Diffusion Model

Denoising diffusion models hold great promise for generating diverse and realistic human motions. However, existing motion diffusion models largely disregard the laws of physics in the diffusion process and often generate…

Computer Vision and Pattern Recognition · Computer Science 2023-08-22 Ye Yuan, Jiaming Song, Umar Iqbal, Arash Vahdat, Jan Kautz

Motion Generation from Fine-grained Textual Descriptions

The task of text2motion is to generate human motion sequences from given textual descriptions, where the model explores diverse mappings from natural language instructions to human body movements. While most existing works are confined to…

Artificial Intelligence · Computer Science 2024-03-27 Kunhang Li, Yansong Feng

Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model

Text-driven human motion generation in computer vision is both significant and challenging. However, current methods are limited to producing either deterministic or imprecise motion sequences, failing to effectively control the temporal…

Computer Vision and Pattern Recognition · Computer Science 2023-09-13 Yin Wang, Zhiying Leng, Frederick W. B. Li, Shun-Cheng Wu, Xiaohui Liang

Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation

Text-to-motion generation requires not only grounding local actions in language but also seamlessly blending these individual actions to synthesize diverse and realistic global motions. However, existing motion generation methods primarily…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Runyi Yu, Chang Liu, Xiangyang Ji, Li Yuan, Jie Chen

Strong and Controllable 3D Motion Generation

Human motion generation is a significant pursuit in generative computer vision with widespread applications in film-making, video games, AR/VR, and human-robot interaction. Current methods mainly utilize either diffusion-based generative…

Computer Vision and Pattern Recognition · Computer Science 2025-02-03 Canxuan Gang

DiffusionPhase: Motion Diffusion in Frequency Domain

In this study, we introduce a learning-based method for generating high-quality human motion sequences from text descriptions (e.g., "A person walks forward"). Existing techniques struggle with motion diversity and smooth transitions in…

Computer Vision and Pattern Recognition · Computer Science 2023-12-08 Weilin Wan, Yiming Huang, Shutong Wu, Taku Komura, Wenping Wang, Dinesh Jayaraman, Lingjie Liu

Back to Basics: Motion Representation Matters for Human Motion Generation Using Diffusion Model

Diffusion models have emerged as a widely utilized and successful methodology in human motion synthesis. Task-oriented diffusion models have significantly advanced action-to-motion, text-to-motion, and audio-to-motion applications. In this…

Computer Vision and Pattern Recognition · Computer Science 2025-12-05 Yuduo Jin, Brandon Haworth

DiverseMotion: Towards Diverse Human Motion Generation via Discrete Diffusion

We present DiverseMotion, a new approach for synthesizing high-quality human motions conditioned on textual descriptions while preserving motion diversity. Despite the recent significant progress in text-based human motion generation, existing…

Computer Vision and Pattern Recognition · Computer Science 2023-09-06 Yunhong Lou, Linchao Zhu, Yaxiong Wang, Xiaohan Wang, Yi Yang

PackDiT: Joint Human Motion and Text Generation via Mutual Prompting

Human motion generation has advanced markedly with the advent of diffusion models. Most recent studies have concentrated on generating motion sequences based on text prompts, commonly referred to as text-to-motion generation. However, the…

Computer Vision and Pattern Recognition · Computer Science 2025-01-29 Zhongyu Jiang, Wenhao Chai, Zhuoran Zhou, Cheng-Yen Yang, Hsiang-Wei Huang, Jenq-Neng Hwang

MotionDiff: Training-free Zero-shot Interactive Motion Editing via Flow-assisted Multi-view Diffusion

Generative models have made remarkable advancements and are capable of producing high-quality content. However, performing controllable editing with generative models remains challenging, due to their inherent uncertainty in outputs. This…

Computer Vision and Pattern Recognition · Computer Science 2025-07-09 Yikun Ma, Yiqing Li, Jiawei Wu, Xing Luo, Zhi Jin

FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation

We present FloodDiffusion, a new framework for text-driven, streaming human motion generation. Given time-varying text prompts, FloodDiffusion generates text-aligned, seamless motion sequences with real-time latency. Unlike existing methods…

Computer Vision and Pattern Recognition · Computer Science 2026-02-09 Yiyi Cai, Yuhan Wu, Kunhang Li, You Zhou, Bo Zheng, Haiyang Liu