Related papers: Cinematographic Camera Diffusion Model

200 papers

Modern text-to-video synthesis models demonstrate coherent, photorealistic generation of complex videos from a text description. However, most existing models lack fine-grained control over camera movement, which is critical for downstream…

Traditional 3D content creation tools empower users to bring their imagination to life by giving them direct control over a scene's geometry, appearance, motion, and camera path. Creating computer-generated videos, however, is a tedious…

Computer Vision and Pattern Recognition · Computer Science 2023-12-05 Shengqu Cai, Duygu Ceylan, Matheus Gadelha, Chun-Hao Paul Huang, Tuanfeng Yang Wang, Gordon Wetzstein

State-of-the-art diffusion models can generate highly realistic images based on various conditioning like text, segmentation, and depth. However, an essential aspect often overlooked is the specific camera geometry used during image…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Andrey Voynov, Amir Hertz, Moab Arar, Shlomi Fruchter, Daniel Cohen-Or

Recent advancements in 3D generation have leveraged synthetic datasets with ground truth 3D assets and predefined cameras. However, the potential of adopting real-world datasets, which can produce significantly more realistic 3D scenes,…

Computer Vision and Pattern Recognition · Computer Science 2024-06-26 Xinyang Li, Zhangyu Lai, Linning Xu, Yansong Qu, Liujuan Cao, Shengchuan Zhang, Bo Dai, Rongrong Ji

Numerous works have recently integrated 3D camera control into foundational text-to-video models, but the resulting camera control is often imprecise, and video generation quality suffers. In this work, we analyze camera motion from a first…

Computer Vision and Pattern Recognition · Computer Science 2025-05-07 Sherwin Bahmani, Ivan Skorokhodov, Guocheng Qian, Aliaksandr Siarohin, Willi Menapace, Andrea Tagliasacchi, David B. Lindell, Sergey Tulyakov

Diffusion models are able to generate photorealistic images in arbitrary scenes. However, when applying diffusion models to image translation, there exists a trade-off between maintaining spatial structure and high-quality content. Besides,…

Computer Vision and Pattern Recognition · Computer Science 2023-02-07 Shiqi Sun, Shancheng Fang, Qian He, Wei Liu

Diffusion-based methods can generate realistic images and videos, but they struggle to edit existing objects in a video while preserving their appearance over time. This prevents diffusion models from being applied to natural video editing…

Computer Vision and Pattern Recognition · Computer Science 2023-08-21 Wenhao Chai, Xun Guo, Gaoang Wang, Yan Lu

Long-range human movement generation remains a central challenge in computer vision and graphics. Generating coherent transitions across semantically distinct motion domains remains largely unexplored. This capability is particularly…

Computer Vision and Pattern Recognition · Computer Science 2026-04-07 Haichao Wang, Alexander Okupnik, Yuxing Han, Gene Wen, Johannes Schneider, Kyriakos Flouris

Diffusion models have achieved great progress in image animation due to powerful generative capabilities. However, maintaining spatio-temporal consistency with detailed information from the input static image over time (e.g., style,…

Computer Vision and Pattern Recognition · Computer Science 2024-07-24 Xin Ma, Yaohui Wang, Gengyun Jia, Xinyuan Chen, Yuan-Fang Li, Cunjian Chen, Yu Qiao

User-generated cinematic creations are gaining popularity as daily entertainment, yet mastering cinematography to produce immersive content remains a challenge. Many existing automatic methods focus on roughly controlling predefined…

Multimedia · Computer Science 2024-05-24 Xinyi Wu, Haohong Wang, Aggelos K. Katsaggelos

We present a novel character control framework that effectively utilizes motion diffusion probabilistic models to generate high-quality and diverse character animations, responding in real-time to a variety of dynamic user-supplied control…

Graphics · Computer Science 2024-04-24 Rui Chen, Mingyi Shi, Shaoli Huang, Ping Tan, Taku Komura, Xuelin Chen

Text-to-video generation aims to produce a video based on a given prompt. Recently, several commercial video models have been able to generate plausible videos with minimal noise, excellent details, and high aesthetic scores. However, these…

Computer Vision and Pattern Recognition · Computer Science 2024-01-18 Haoxin Chen, Yong Zhang, Xiaodong Cun, Menghan Xia, Xintao Wang, Chao Weng, Ying Shan

Large text-to-image diffusion models have exhibited impressive proficiency in generating high-quality images. However, when applying these models to the video domain, ensuring temporal consistency across video frames remains a formidable…

Computer Vision and Pattern Recognition · Computer Science 2023-09-19 Shuai Yang, Yifan Zhou, Ziwei Liu, Chen Change Loy

Portrait animation aims to generate photo-realistic videos from a single source image by reenacting the expression and pose from a driving video. While early methods relied on 3D morphable models or feature warping techniques, they often…

Computer Vision and Pattern Recognition · Computer Science 2025-09-23 Mallikarjun B. R., Fei Yin, Vikram Voleti, Nikita Drobyshev, Maksim Lapin, Aaryaman Vasishta, Varun Jampani

Recent advances in text-to-image generation with diffusion models present transformative capabilities in image quality. However, user controllability of the generated image and fast adaptation to new tasks still remain an open challenge,…

Computer Vision and Pattern Recognition · Computer Science 2023-02-17 Omer Bar-Tal, Lior Yariv, Yaron Lipman, Tali Dekel

Image diffusion models are trained on independently sampled static images. While this is the bedrock task protocol in generative modeling, capturing the temporal world through the lens of static snapshots is information-deficient by design.…

Computer Vision and Pattern Recognition · Computer Science 2025-09-05 Juhun Lee, Simon S. Woo

Human motion generation is a significant pursuit in generative computer vision with widespread applications in film-making, video games, AR/VR, and human-robot interaction. Current methods mainly utilize either diffusion-based generative…

Computer Vision and Pattern Recognition · Computer Science 2025-02-03 Canxuan Gang

Camera-controllable video generation aims to synthesize videos with flexible and physically plausible camera movements. However, existing methods either provide imprecise camera control from text prompts or rely on labor-intensive manual…

Computer Vision and Pattern Recognition · Computer Science 2026-04-13 Haoyu Zhao, Zihao Zhang, Jiaxi Gu, Haoran Chen, Qingping Zheng, Pin Tang, Yeyin Jin, Yuang Zhang, Junqi Cheng, Zenghui Lu, Peng Shu, Zuxuan Wu, Yu-Gang Jiang

While modern diffusion models excel at generating high-quality and diverse images, they still struggle with high-fidelity compositional and multimodal control, particularly when users simultaneously specify text prompts, subject references,…

Computer Vision and Pattern Recognition · Computer Science 2025-11-27 Yusuf Dalva, Guocheng Gordon Qian, Maya Goldenberg, Tsai-Shien Chen, Kfir Aberman, Sergey Tulyakov, Pinar Yanardag, Kuan-Chieh Jackson Wang

Text-to-video models have demonstrated impressive capabilities in producing diverse and captivating video content, showcasing a notable advancement in generative AI. However, these models generally lack fine-grained control over motion…

Computer Vision and Pattern Recognition · Computer Science 2024-12-09 Tuna Han Salih Meral, Hidir Yesiltepe, Connor Dunlop, Pinar Yanardag