Related papers: Towards Data-Driven Automatic Video Editing
This paper presents a novel framework termed Cut-and-Paste for real-word semantic video editing under the guidance of text prompt and additional reference image. While the text-driven video editing has demonstrated remarkable ability to…
We propose an automatic approach that extracts editing styles in a source video and applies the edits to matched footage for video creation. Our Computer Vision based techniques considers framing, content type, playback speed, and lighting…
Video content creation keeps growing at an incredible pace; yet, creating engaging stories remains challenging and requires non-trivial video editing expertise. Many video editing components are astonishingly hard to automate primarily due…
Video editing can be a very tedious task, so unsurprisingly Artificial Intelligence has been increasingly used to streamline the workflow or automate away tedious tasks. However, it is very difficult to get an overview of what intelligent…
Machine learning is transforming the video editing industry. Recent advances in computer vision have leveled-up video editing tasks such as intelligent reframing, rotoscoping, color grading, or applying digital makeups. However, most of the…
This paper presents a novel retrieval pipeline for video collections, which aims to retrieve the most significant parts of an edited video for a given query, and represent them with thumbnails which are at the same time semantically…
As information becomes more accessible, user-generated videos are increasing in length, placing a burden on viewers to sift through vast content for valuable insights. This trend underscores the need for an algorithm to extract key video…
Automated tools for video editing and assembly have applications ranging from filmmaking and advertisement to content creation for social media. Previous video editing work has mainly focused on either retrieval or user interfaces, leaving…
Creators struggle to edit long-form, narrative-rich videos not because of UI complexity, but due to the cognitive demands of searching, storyboarding, and sequencing hours of footage. Existing transcript- or embedding-based methods fall…
Fine-tuning large-scale text-to-video diffusion models to add new generative controls, such as those over physical camera parameters (e.g., shutter speed or aperture), typically requires vast, high-fidelity datasets that are difficult to…
In this era of videos, automatic video editing techniques attract more and more attention from industry and academia since they can reduce workloads and lower the requirements for human editors. Existing automatic editing systems are mainly…
Modern visual effects (VFX) software has made it possible for skilled artists to create imagery of virtually anything. However, the creation process remains laborious, complex, and largely inaccessible to everyday users. In this work, we…
Denoising is a crucial step in many video processing pipelines such as in interactive editing, where high quality, speed, and user control are essential. While recent approaches achieve significant improvements in denoising quality by…
Computational Video Editing Systems output video generally follows a particular form, e.g. conversation or music videos, in this way they are domain specific. We describe a recent development in our video annotation and segmentation system…
We present a method for zero-shot, text-driven appearance manipulation in natural images and videos. Given an input image or video and a target text prompt, our goal is to edit the appearance of existing objects (e.g., object's texture) or…
We introduce a novel and efficient approach for text-based video-to-video editing that eliminates the need for resource-intensive per-video-per-model finetuning. At the core of our approach is a synthetic paired video dataset tailored for…
Video is a powerful medium for communication and storytelling, yet reauthoring existing footage remains challenging. Even simple edits often demand expertise, time, and careful planning, constraining how creators envision and shape their…
For the last decades, the concern of producing convincing facial animation has garnered great interest, that has only been accelerating with the recent explosion of 3D content in both entertainment and professional activities. The use of…
Cinematic storytelling is profoundly shaped by the artful manipulation of photographic elements such as depth of field and exposure. These effects are crucial in conveying mood and creating aesthetic appeal. However, controlling these…
Temporal consistency is essential for video editing applications. Existing work on layered representation of videos allows propagating edits consistently to each frame. These methods, however, can only edit object appearance rather than…