Related papers: Neutral Editing Framework for Diffusion-based Vide…

Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing

Text-driven video editing utilizing generative diffusion models has garnered significant attention due to their potential applications. However, existing approaches are constrained by the limited word embeddings provided in pre-training,…

Computer Vision and Pattern Recognition · Computer Science 2025-05-28 Mingce Guo, Jingxuan He, Shengeng Tang, Zhangye Wang, Lechao Cheng

Neural Video Fields Editing

Diffusion models have revolutionized text-driven video editing. However, applying these methods to real-world editing encounters two significant challenges: (1) the rapid increase in GPU memory demand as the number of frames grows, and (2)…

Computer Vision and Pattern Recognition · Computer Science 2024-03-12 Shuzhou Yang, Chong Mou, Jiwen Yu, Yuhan Wang, Xiandong Meng, Jian Zhang

StableVideo: Text-driven Consistency-aware Diffusion Video Editing

Diffusion-based methods can generate realistic images and videos, but they struggle to edit existing objects in a video while preserving their appearance over time. This prevents diffusion models from being applied to natural video editing…

Computer Vision and Pattern Recognition · Computer Science 2023-08-21 Wenhao Chai, Xun Guo, Gaoang Wang, Yan Lu

UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing

Recent advances in text-guided video editing have showcased promising results in appearance editing (e.g., stylization). However, video motion editing in the temporal dimension (e.g., from eating to waving), which distinguishes video…

Computer Vision and Pattern Recognition · Computer Science 2024-04-09 Jianhong Bai, Tianyu He, Yuchi Wang, Junliang Guo, Haoji Hu, Zuozhu Liu, Jiang Bian

DNI: Dilutional Noise Initialization for Diffusion Video Editing

Text-based diffusion video editing systems have been successful in performing edits with high fidelity and textual alignment. However, this success is limited to rigid-type editing such as style transfer and object overlay, while preserving…

Computer Vision and Pattern Recognition · Computer Science 2024-09-23 Sunjae Yoon, Gwanhyeong Koo, Ji Woo Hong, Chang D. Yoo

Versatile Editing of Video Content, Actions, and Dynamics without Training

Controlled video generation has seen drastic improvements in recent years. However, editing actions and dynamic events, or inserting contents that should affect the behaviors of other objects in real-world videos, remains a major challenge.…

Computer Vision and Pattern Recognition · Computer Science 2026-03-19 Vladimir Kulikov, Roni Paiss, Andrey Voynov, Inbar Mosseri, Tali Dekel, Tomer Michaeli

DriveEditor: A Unified 3D Information-Guided Framework for Controllable Object Editing in Driving Scenes

Vision-centric autonomous driving systems require diverse data for robust training and evaluation, which can be augmented by manipulating object positions and appearances within existing scene captures. While recent advancements in…

Computer Vision and Pattern Recognition · Computer Science 2024-12-31 Yiyuan Liang, Zhiying Yan, Liqun Chen, Jiahuan Zhou, Luxin Yan, Sheng Zhong, Xu Zou

Towards a Training Free Approach for 3D Scene Editing

Text-driven diffusion models have shown remarkable capabilities in editing images. However, when editing 3D scenes, existing works mostly rely on training a NeRF for 3D editing. Recent NeRF editing methods leverage edit operations by…

Computer Vision and Pattern Recognition · Computer Science 2024-12-18 Vivek Madhavaram, Shivangana Rawat, Chaitanya Devaguptapu, Charu Sharma, Manohar Kaul

Drag-A-Video: Non-rigid Video Editing with Point-based Interaction

Video editing is a challenging task that requires manipulating videos on both the spatial and temporal dimensions. Existing methods for video editing mainly focus on changing the appearance or style of the objects in the video, while…

Computer Vision and Pattern Recognition · Computer Science 2023-12-06 Yao Teng, Enze Xie, Yue Wu, Haoyu Han, Zhenguo Li, Xihui Liu

InFusion: Inject and Attention Fusion for Multi Concept Zero-Shot Text-based Video Editing

Large text-to-image diffusion models have achieved remarkable success in generating diverse, high-quality images. Additionally, these models have been successfully leveraged to edit input images by just changing the text prompt. But when…

Computer Vision and Pattern Recognition · Computer Science 2023-08-11 Anant Khandelwal

Diffusion Model-Based Image Editing: A Survey

Denoising diffusion models have emerged as a powerful tool for various image generation and editing tasks, facilitating the synthesis of visual content in an unconditional or input-conditional manner. The core idea behind them is learning…

Computer Vision and Pattern Recognition · Computer Science 2025-03-12 Yi Huang, Jiancheng Huang, Yifan Liu, Mingfu Yan, Jiaxi Lv, Jianzhuang Liu, Wei Xiong, He Zhang, Liangliang Cao, Shifeng Chen

I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models

The remarkable generative capabilities of diffusion models have motivated extensive research in both image and video editing. Compared to video editing, which faces additional challenges in the time dimension, image editing has witnessed the…

Computer Vision and Pattern Recognition · Computer Science 2024-05-28 Wenqi Ouyang, Yi Dong, Lei Yang, Jianlou Si, Xingang Pan

Pix2Video: Video Editing using Image Diffusion

Image diffusion models, trained on massive image collections, have emerged as the most versatile image generator model in terms of quality and diversity. They support inverting real images and conditional (e.g., text) generation, making…

Computer Vision and Pattern Recognition · Computer Science 2023-03-23 Duygu Ceylan, Chun-Hao Paul Huang, Niloy J. Mitra

Dreamix: Video Diffusion Models are General Video Editors

Text-driven image and video diffusion models have recently achieved unprecedented generation realism. While diffusion models have been successfully applied for image editing, very few works have done so for video editing. We present the…

Computer Vision and Pattern Recognition · Computer Science 2023-02-03 Eyal Molad, Eliahu Horwitz, Dani Valevski, Alex Rav Acha, Yossi Matias, Yael Pritch, Yaniv Leviathan, Yedid Hoshen

Audio-Guided Visual Editing with Complex Multi-Modal Prompts

Visual editing with diffusion models has made significant progress but often struggles with complex scenarios that textual guidance alone could not adequately describe, highlighting the need for additional non-text editing prompts. In this…

Computer Vision and Pattern Recognition · Computer Science 2025-08-29 Hyeonyu Kim, Seokhoon Jeong, Seonghee Han, Chanhyuk Choi, Taehwan Kim

TDEdit: A Unified Diffusion Framework for Text-Drag Guided Image Manipulation

This paper explores image editing under the joint control of text and drag interactions. While recent advances in text-driven and drag-driven editing have achieved remarkable progress, they suffer from complementary limitations: text-driven…

Computer Vision and Pattern Recognition · Computer Science 2025-09-29 Qihang Wang, Yaxiong Wang, Lechao Cheng, Zhun Zhong

Unified Diffusion-Based Rigid and Non-Rigid Editing with Text and Image Guidance

Existing text-to-image editing methods tend to excel either in rigid or non-rigid editing but encounter challenges when combining both, resulting in misaligned outputs with the provided text prompts. In addition, integrating reference…

Computer Vision and Pattern Recognition · Computer Science 2024-01-05 Jiacheng Wang, Ping Liu, Wei Xu

DiffMagicFace: Identity Consistent Facial Editing of Real Videos

Text-conditioned image editing has greatly benefitted from the advancements in Image Diffusion Models. However, extending these techniques to facial video editing introduces challenges in preserving facial identity throughout the source…

Computer Vision and Pattern Recognition · Computer Science 2026-04-16 Huanghao Yin, Shenkun Xu, Kanle Shi, Junhai Yong, Bin Wang

Differential Diffusion: Giving Each Pixel Its Strength

Diffusion models have revolutionized image generation and editing, producing state-of-the-art results in conditioned and unconditioned image synthesis. While current techniques enable user control over the degree of change in an image edit,…

Computer Vision and Pattern Recognition · Computer Science 2024-03-01 Eran Levin, Ohad Fried

PAIR-Diffusion: A Comprehensive Multimodal Object-Level Image Editor

Generative image editing has recently witnessed extremely fast-paced growth. Some works use high-level conditioning such as text, while others use low-level conditioning. Nevertheless, most of them lack fine-grained control over the…

Computer Vision and Pattern Recognition · Computer Science 2024-04-10 Vidit Goel, Elia Peruzzo, Yifan Jiang, Dejia Xu, Xingqian Xu, Nicu Sebe, Trevor Darrell, Zhangyang Wang, Humphrey Shi