Related papers: Kinetic Typography Diffusion Model

Dynamic Typography: Bringing Text to Life via Video Diffusion Prior

Text animation serves as an expressive medium, transforming static communication into dynamic experiences by infusing words with motion to evoke emotions, emphasize meanings, and construct compelling narratives. Crafting animations that are…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-06 · Zichen Liu, Yihao Meng, Hao Ouyang, Yue Yu, Bolin Zhao, Daniel Cohen-Or, Huamin Qu

GlyphDiffusion: Text Generation as Image Generation

Diffusion models have become a new generative paradigm for text generation. Considering the discrete categorical nature of text, in this paper, we propose GlyphDiffusion, a novel diffusion approach for text generation via text-guided image…

Computation and Language · Computer Science · 2023-05-09 · Junyi Li, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models

Traditional 3D content creation tools empower users to bring their imagination to life by giving them direct control over a scene's geometry, appearance, motion, and camera path. Creating computer-generated videos, however, is a tedious…

Computer Vision and Pattern Recognition · Computer Science · 2023-12-05 · Shengqu Cai, Duygu Ceylan, Matheus Gadelha, Chun-Hao Paul Huang, Tuanfeng Yang Wang, Gordon Wetzstein

The CLIP Model is Secretly an Image-to-Prompt Converter

The Stable Diffusion model is a prominent text-to-image generation model that relies on a text prompt as its input, which is encoded using the Contrastive Language-Image Pre-Training (CLIP). However, text prompts have limitations when it…

Computer Vision and Pattern Recognition · Computer Science · 2024-02-16 · Yuxuan Ding, Chunna Tian, Haoxuan Ding, Lingqiao Liu

Enhanced Fine-grained Motion Diffusion for Text-driven Human Motion Synthesis

The emergence of text-driven motion synthesis technique provides animators with great potential to create efficiently. However, in most cases, textual expressions only contain general and qualitative motion descriptions, while lack fine…

Computer Vision and Pattern Recognition · Computer Science · 2023-12-27 · Dong Wei, Xiaoning Sun, Huaijiang Sun, Bin Li, Shengxiang Hu, Weiqing Li, Jianfeng Lu

CustomText: Customized Textual Image Generation using Diffusion Models

Textual image generation spans diverse fields like advertising, education, product packaging, social media, information visualization, and branding. Despite recent strides in language-guided image synthesis using diffusion models, current…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-22 · Shubham Paliwal, Arushi Jain, Monika Sharma, Vikram Jamwal, Lovekesh Vig

StableVideo: Text-driven Consistency-aware Diffusion Video Editing

Diffusion-based methods can generate realistic images and videos, but they struggle to edit existing objects in a video while preserving their appearance over time. This prevents diffusion models from being applied to natural video editing…

Computer Vision and Pattern Recognition · Computer Science · 2023-08-21 · Wenhao Chai, Xun Guo, Gaoang Wang, Yan Lu

Diffusion Motion: Generate Text-Guided 3D Human Motion by Diffusion Model

We propose a simple and novel method for generating 3D human motion from complex natural language sentences, which describe different velocity, direction and composition of all kinds of actions. Different from existing methods that use…

Computer Vision and Pattern Recognition · Computer Science · 2023-04-17 · Zhiyuan Ren, Zhihong Pan, Xin Zhou, Le Kang

TokenFlow: Consistent Diffusion Features for Consistent Video Editing

The generative AI revolution has recently expanded to videos. Nevertheless, current state-of-the-art video models are still lagging behind image models in terms of visual quality and user control over the generated content. In this work, we…

Computer Vision and Pattern Recognition · Computer Science · 2023-11-21 · Michal Geyer, Omer Bar-Tal, Shai Bagon, Tali Dekel

Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion Models

Recent advances in diffusion models enable many powerful instruments for image editing. One of these instruments is text-driven image manipulations: editing semantic attributes of an image according to the provided text description.…

Computer Vision and Pattern Recognition · Computer Science · 2023-04-11 · Nikita Starodubcev, Dmitry Baranchuk, Valentin Khrulkov, Artem Babenko

Pix2Video: Video Editing using Image Diffusion

Image diffusion models, trained on massive image collections, have emerged as the most versatile image generator model in terms of quality and diversity. They support inverting real images and conditional (e.g., text) generation, making…

Computer Vision and Pattern Recognition · Computer Science · 2023-03-23 · Duygu Ceylan, Chun-Hao Paul Huang, Niloy J. Mitra

Diffusion Explainer: Visual Explanation for Text-to-image Stable Diffusion

Diffusion-based generative models' impressive ability to create convincing images has garnered global attention. However, their complex structures and operations often pose challenges for non-experts to grasp. We present Diffusion…

Computation and Language · Computer Science · 2024-09-04 · Seongmin Lee, Benjamin Hoover, Hendrik Strobelt, Zijie J. Wang, ShengYun Peng, Austin Wright, Kevin Li, Haekyu Park, Haoyang Yang, Duen Horng Chau

Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models

Diffusion models have achieved great progress in image animation due to powerful generative capabilities. However, maintaining spatio-temporal consistency with detailed information from the input static image over time (e.g., style,…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-24 · Xin Ma, Yaohui Wang, Gengyun Jia, Xinyuan Chen, Yuan-Fang Li, Cunjian Chen, Yu Qiao

TextDiffuser: Diffusion Models as Text Painters

Diffusion models have gained increasing attention for their impressive generation abilities but currently struggle with rendering accurate and coherent text. To address this issue, we introduce TextDiffuser, focusing on generating images…

Computer Vision and Pattern Recognition · Computer Science · 2023-10-31 · Jingye Chen, Yupan Huang, Tengchao Lv, Lei Cui, Qifeng Chen, Furu Wei

Animated Stickers: Bringing Stickers to Life with Video Diffusion

We introduce animated stickers, a video diffusion model which generates an animation conditioned on a text prompt and static sticker image. Our model is built on top of the state-of-the-art Emu text-to-image model, with the addition of…

Computer Vision and Pattern Recognition · Computer Science · 2024-02-12 · David Yan, Winnie Zhang, Luxin Zhang, Anmol Kalia, Dingkang Wang, Ankit Ramchandani, Miao Liu, Albert Pumarola, Edgar Schoenfeld, Elliot Blanchard, Krishna Narni, Yaqiao Luo, Lawrence Chen, Guan Pang, Ali Thabet, Peter Vajda, Amy Bearman, Licheng Yu

Zero-Shot Paragraph-level Handwriting Imitation with Latent Diffusion Models

The imitation of cursive handwriting is mainly limited to generating handwritten words or lines. Multiple synthetic outputs must be stitched together to create paragraphs or whole pages, whereby consistency and layout information are lost.…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-04 · Martin Mayr, Marcel Dreier, Florian Kordon, Mathias Seuret, Jochen Zöllner, Fei Wu, Andreas Maier, Vincent Christlein

3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models

Text-guided diffusion models have shown superior performance in image/video generation and editing. While few explorations have been performed in 3D scenarios. In this paper, we discuss three fundamental and interesting problems on this…

Computer Vision and Pattern Recognition · Computer Science · 2023-10-13 · Gang Li, Heliang Zheng, Chaoyue Wang, Chang Li, Changwen Zheng, Dacheng Tao

Dreamix: Video Diffusion Models are General Video Editors

Text-driven image and video diffusion models have recently achieved unprecedented generation realism. While diffusion models have been successfully applied for image editing, very few works have done so for video editing. We present the…

Computer Vision and Pattern Recognition · Computer Science · 2023-02-03 · Eyal Molad, Eliahu Horwitz, Dani Valevski, Alex Rav Acha, Yossi Matias, Yael Pritch, Yaniv Leviathan, Yedid Hoshen

Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer

Large-scale text-to-video diffusion models have demonstrated an exceptional ability to synthesize diverse videos. However, due to the lack of extensive text-to-video datasets and the necessary computational resources for training, directly…

Computer Vision and Pattern Recognition · Computer Science · 2023-05-10 · Nisha Huang, Yuxin Zhang, Weiming Dong

Scribble-Guided Diffusion for Training-free Text-to-Image Generation

Recent advancements in text-to-image diffusion models have demonstrated remarkable success, yet they often struggle to fully capture the user's intent. Existing approaches using textual inputs combined with bounding boxes or region masks…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-13 · Seonho Lee, Jiho Choi, Seohyun Lim, Jiwook Kim, Hyunjung Shim