Related papers: Less is More: Improving Motion Diffusion Models wi…
The emergence of text-driven motion synthesis technique provides animators with great potential to create efficiently. However, in most cases, textual expressions only contain general and qualitative motion descriptions, while lack fine…
Motion in-betweening, a fundamental task in character animation, consists of generating motion sequences that plausibly interpolate user-provided keyframe constraints. It has long been recognized as a labor-intensive and challenging…
Denoising diffusion models have shown great promise in human motion synthesis conditioned on natural language descriptions. However, integrating spatial constraints, such as pre-defined motion trajectories and obstacles, remains a challenge…
Recent advancements in diffusion models have significantly improved the realism and generalizability of character-driven animation, enabling the synthesis of high-quality motion from just a single RGB image and a set of driving poses.…
The emergence of generative AI and controllable diffusion has made image-to-image synthesis increasingly practical and efficient. However, when input images exhibit low entropy and sparse, the inherent characteristics of diffusion models…
We present a method for generating video sequences with coherent motion between a pair of input key frames. We adapt a pretrained large-scale image-to-video diffusion model (originally trained to generate videos moving forward in time from…
Effective human behavior modeling is critical for successful human-robot interaction. Current state-of-the-art approaches for predicting listening head behavior during dyadic conversations employ continuous-to-discrete representations,…
Keyframes are a standard representation for kinematic motion specification. Recent learned motion-inbetweening methods use keyframes as a way to control generative motion models, and are trained to generate life-like motion that matches the…
Diffusion-based inpainting is a powerful tool for the reconstruction of images from sparse data. Its quality strongly depends on the choice of known data. Optimising their spatial location -- the inpainting mask -- is challenging. A…
Creating expressive character animations is labor-intensive, requiring intricate manual adjustment of animators across space and time. Previous works on controllable motion generation often rely on a predefined set of dense spatio-temporal…
Recent advances in generative models have yielded impressive progress on motion in-betweening, allowing for more complex, varied, and realistic motion transitions. However, recent methods still exhibit noticeable limitations in preserving…
Diffusion models that are based on iterative denoising have been recently proposed and leveraged in various generation tasks like image generation. Whereas, as a way inherently built for continuous data, existing diffusion models still have…
Controllable generation of 3D human motions becomes an important topic as the world embraces digital transformation. Existing works, though making promising progress with the advent of diffusion models, heavily rely on meticulously captured…
Prior masked modeling motion generation methods predominantly study text-to-motion. We present DiMo, a discrete diffusion-style framework, which extends masked modeling to bidirectional text--motion understanding and generation. Unlike…
Portrait animation aims to generate photo-realistic videos from a single source image by reenacting the expression and pose from a driving video. While early methods relied on 3D morphable models or feature warping techniques, they often…
Video generation using diffusion-based models is constrained by high computational costs due to the frame-wise iterative diffusion process. This work presents a Diffusion Reuse MOtion (Dr. Mo) network to accelerate latent video generation.…
Generative models, particularly diffusion models, have made significant success in data synthesis across various modalities, including images, videos, and 3D assets. However, current diffusion models are computationally intensive, often…
For bandwidth-constrained multimedia applications, simultaneously achieving ultra-low bitrate human video compression and accurate vertex prediction remains a critical challenge, as it demands the harmonization of dynamic motion modeling,…
Diffusion models have shown remarkable success across a wide range of generative tasks. However, they often suffer from spatially inconsistent generation, arguably due to the inherent locality of their denoising mechanisms. This can yield…
Interactive motion synthesis is essential in creating immersive experiences in entertainment applications, such as video games and virtual reality. However, generating animations that are both high-quality and contextually responsive…