Related papers: Motion Inversion for Video Customization

Separate Motion from Appearance: Customizing Motion via Customizing Text-to-Video Diffusion Models

Motion customization aims to adapt the diffusion model (DM) to generate videos with the motion specified by a set of video clips with the same motion concept. To realize this goal, the adaptation of DM should be possible to model the…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-09 · Huijie Liu, Jingyun Wang, Shuai Ma, Jie Hu, Xiaoming Wei, Guoliang Kang

SynMotion: Semantic-Visual Adaptation for Motion Customized Video Generation

Diffusion-based video motion customization facilitates the acquisition of human motion representations from a few video samples, while achieving arbitrary subjects transfer through precise textual conditioning. Existing approaches often…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-29 · Shuai Tan, Biao Gong, Yujie Wei, Shiwei Zhang, Zhuoxin Liu, Ke Ma, Yan Wang, Kecheng Zheng, Xing Zhu, Yujun Shen, Hengshuang Zhao

The TIME Machine: On The Power of Motion for Efficient Perception

Video representation learning has seen tremendous progress in recent years. This has been driven by many factors, including the scale of training and the success of visual models trained contrastively with language. While these factors have…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-25 · Mantas Skackauskas, Xinyue Hao, Laura Sevilla-Lara

VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models

Text-to-video diffusion models have advanced video generation significantly. However, customizing these models to generate videos with tailored motions presents a substantial challenge. In specific, they encounter hurdles in (a) accurately…

Computer Vision and Pattern Recognition · Computer Science · 2023-12-05 · Hyeonho Jeong, Geon Yeong Park, Jong Chul Ye

Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models

Image customization has been extensively studied in text-to-image (T2I) diffusion models, leading to impressive outcomes and applications. With the emergence of text-to-video (T2V) diffusion models, its temporal counterpart, motion…

Computer Vision and Pattern Recognition · Computer Science · 2024-08-29 · Yixuan Ren, Yang Zhou, Jimei Yang, Jing Shi, Difan Liu, Feng Liu, Mingi Kwon, Abhinav Shrivastava

Developing Motion Code Embedding for Action Recognition in Videos

In this work, we propose a motion embedding strategy known as motion codes, which is a vectorized representation of motions based on a manipulation's salient mechanical attributes. These motion codes provide a robust motion representation,…

Computer Vision and Pattern Recognition · Computer Science · 2021-08-18 · Maxat Alibayev, David Paulius, Yu Sun

Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion

Recent years have seen a tremendous improvement in the quality of video generation and editing approaches. While several techniques focus on editing appearance, few address motion. Current approaches using text, trajectories, or bounding…

Computer Vision and Pattern Recognition · Computer Science · 2025-05-27 · Manuel Kansy, Jacek Naruniec, Christopher Schroers, Markus Gross, Romann M. Weber

NewMove: Customizing text-to-video models with novel motions

We introduce an approach for augmenting text-to-video generation models with customized motions, extending their capabilities beyond the motions depicted in the original training data. By leveraging a few video samples demonstrating…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-11 · Joanna Materzynska, Josef Sivic, Eli Shechtman, Antonio Torralba, Richard Zhang, Bryan Russell

MotionAdapter: Video Motion Transfer via Content-Aware Attention Customization

Recent advances in diffusion-based text-to-video models, particularly those built on the diffusion transformer architecture, have achieved remarkable progress in generating high-quality and temporally coherent videos. However, transferring…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-08 · Zhexin Zhang, Yangyang Xu, Yifeng Zhu, Long Chen, Yong Du, Shengfeng He, Jun Yu

Trajectory Attention for Fine-grained Video Motion Control

Recent advancements in video generation have been greatly driven by video diffusion models, with camera motion control emerging as a crucial challenge in creating view-customized visual content. This paper introduces trajectory attention, a…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-02 · Zeqi Xiao, Wenqi Ouyang, Yifan Zhou, Shuai Yang, Lei Yang, Jianlou Si, Xingang Pan

Multi-Frame Content Integration with a Spatio-Temporal Attention Mechanism for Person Video Motion Transfer

Existing person video generation methods either lack the flexibility in controlling both the appearance and motion, or fail to preserve detailed appearance and temporal consistency. In this paper, we tackle the problem of motion transfer…

Computer Vision and Pattern Recognition · Computer Science · 2019-08-13 · Kun Cheng, Hao-Zhi Huang, Chun Yuan, Lingyiqing Zhou, Wei Liu

MotionDirector: Motion Customization of Text-to-Video Diffusion Models

Large-scale pre-trained diffusion models have exhibited remarkable capabilities in diverse video generations. Given a set of video clips of the same motion concept, the task of Motion Customization is to adapt existing text-to-video…

Computer Vision and Pattern Recognition · Computer Science · 2023-10-13 · Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, David Junhao Zhang, Jiawei Liu, Weijia Wu, Jussi Keppo, Mike Zheng Shou

DreamVideo: Composing Your Dream Videos with Customized Subject and Motion

Customized generation using diffusion models has made impressive progress in image generation, but remains unsatisfactory in the challenging video generation task, as it requires the controllability of both subjects and motions. To that…

Computer Vision and Pattern Recognition · Computer Science · 2023-12-08 · Yujie Wei, Shiwei Zhang, Zhiwu Qing, Hangjie Yuan, Zhiheng Liu, Yu Liu, Yingya Zhang, Jingren Zhou, Hongming Shan

CoMo: Compositional Motion Customization for Text-to-Video Generation

While recent text-to-video models excel at generating diverse scenes, they struggle with precise motion control, particularly for complex, multi-subject motions. Although methods for single-motion customization have been developed to…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-28 · Youcan Xu, Zhen Wang, Jiaxin Shi, Kexin Li, Feifei Shao, Jun Xiao, Yi Yang, Jun Yu, Long Chen

ReVideo: Remake a Video with Motion and Content Control

Despite significant advancements in video generation and editing using diffusion models, achieving accurate and localized video editing remains a substantial challenge. Additionally, most existing video editing methods primarily focus on…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-24 · Chong Mou, Mingdeng Cao, Xintao Wang, Zhaoyang Zhang, Ying Shan, Jian Zhang

IM-Animation: An Implicit Motion Representation for Identity-decoupled Character Animation

Recent progress in video diffusion models has markedly advanced character animation, which synthesizes motioned videos by animating a static identity image according to a driving video. Explicit methods represent motion using skeleton,…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-10 · Zhufeng Xu, Xuan Gao, Feng-Lin Liu, Haoxian Zhang, Zhixue Fang, Yu-Kun Lai, Xiaoqiang Liu, Pengfei Wan, Lin Gao

MotionMatcher: Motion Customization of Text-to-Video Diffusion Models via Motion Feature Matching

Text-to-video (T2V) diffusion models have shown promising capabilities in synthesizing realistic videos from input text prompts. However, the input text description alone provides limited control over the precise objects movements and…

Computer Vision and Pattern Recognition · Computer Science · 2025-02-20 · Yen-Siang Wu, Chi-Pin Huang, Fu-En Yang, Yu-Chiang Frank Wang

VideoBooth: Diffusion-based Video Generation with Image Prompts

Text-driven video generation witnesses rapid progress. However, merely using text prompts is not enough to depict the desired subject appearance that accurately aligns with users' intents, especially for customized content creation. In this…

Computer Vision and Pattern Recognition · Computer Science · 2023-12-04 · Yuming Jiang, Tianxing Wu, Shuai Yang, Chenyang Si, Dahua Lin, Yu Qiao, Chen Change Loy, Ziwei Liu

Scene Matters: Model-based Deep Video Compression

Video compression has always been a popular research area, where many traditional and deep video compression methods have been proposed. These methods typically rely on signal prediction theory to enhance compression performance by…

Computer Vision and Pattern Recognition · Computer Science · 2023-08-31 · Lv Tang, Xinfeng Zhang, Gai Zhang, Xiaoqi Ma

MotionFlow: Attention-Driven Motion Transfer in Video Diffusion Models

Text-to-video models have demonstrated impressive capabilities in producing diverse and captivating video content, showcasing a notable advancement in generative AI. However, these models generally lack fine-grained control over motion…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-09 · Tuna Han Salih Meral, Hidir Yesiltepe, Connor Dunlop, Pinar Yanardag