Related papers: Meta Learning for Task-Driven Video Summarization

Progressive Video Summarization via Multimodal Self-supervised Learning

Modern video summarization methods are based on deep neural networks that require a large amount of annotated data for training. However, existing datasets for video summarization are small-scale, easily leading to over-fitting of the deep…

Computer Vision and Pattern Recognition · Computer Science 2022-10-20 Li Haopeng, Ke Qiuhong, Gong Mingming, Tom Drummond

Video Summarization Using Deep Neural Networks: A Survey

Video summarization technologies aim to create a concise and complete synopsis by selecting the most informative parts of the video content. Several approaches have been developed over the last couple of decades and the current state of the…

Computer Vision and Pattern Recognition · Computer Science 2021-09-28 Evlampios Apostolidis, Eleni Adamantidou, Alexandros I. Metsai, Vasileios Mezaris, Ioannis Patras

Personalized Video Summarization using Text-Based Queries and Conditional Modeling

The proliferation of video content on platforms like YouTube and Vimeo presents significant challenges in efficiently locating relevant information. Automatic video summarization aims to address this by extracting and presenting key content…

Computer Vision and Pattern Recognition · Computer Science 2024-08-28 Jia-Hong Huang

GPT2MVS: Generative Pre-trained Transformer-2 for Multi-modal Video Summarization

Traditional video summarization methods generate fixed video representations regardless of user interest. Therefore such methods limit users' expectations in content search and exploration scenarios. Multi-modal video summarization is one…

Computer Vision and Pattern Recognition · Computer Science 2021-04-27 Jia-Hong Huang, Luka Murn, Marta Mrak, Marcel Worring

Conditional Modeling Based Automatic Video Summarization

The aim of video summarization is to shorten videos automatically while retaining the key information necessary to convey the overall story. Video summarization methods mainly rely on visual factors, such as visual consecutiveness and…

Computer Vision and Pattern Recognition · Computer Science 2023-11-22 Jia-Hong Huang, Chao-Han Huck Yang, Pin-Yu Chen, Min-Hung Chen, Marcel Worring

V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning

Video summarization aims to create short, accurate, and cohesive summaries of longer videos. Despite the existence of various video summarization datasets, a notable limitation is their limited amount of source videos, which hampers the…

Computer Vision and Pattern Recognition · Computer Science 2025-10-09 Hang Hua, Yolo Yunlong Tang, Chenliang Xu, Jiebo Luo

TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency

YouTube users looking for instructions for a specific task may spend a long time browsing content trying to find the right video that matches their needs. Creating a visual summary (abridged version of a video) provides viewers with a quick…

Computer Vision and Pattern Recognition · Computer Science 2022-08-16 Medhini Narasimhan, Arsha Nagrani, Chen Sun, Michael Rubinstein, Trevor Darrell, Anna Rohrbach, Cordelia Schmid

Causal Video Summarizer for Video Exploration

Recently, video summarization has been proposed as a method to help video exploration. However, traditional video summarization models only generate a fixed video summary which is usually independent of user-specific needs and hence limits…

Computer Vision and Pattern Recognition · Computer Science 2023-07-06 Jia-Hong Huang, Chao-Han Huck Yang, Pin-Yu Chen, Andrew Brown, Marcel Worring

Transforming Multi-Concept Attention into Video Summarization

Video summarization is among challenging tasks in computer vision, which aims at identifying highlight frames or shots over a lengthy video input. In this paper, we propose a novel attention-based framework for video summarization with…

Computer Vision and Pattern Recognition · Computer Science 2020-06-04 Yen-Ting Liu, Yu-Jhe Li, Yu-Chiang Frank Wang

Multi-modal Summarization for Video-containing Documents

Summarization of multimedia data becomes increasingly significant as it is the basis for many real-world applications, such as question answering, Web search, and so forth. Most existing multi-modal summarization works however have used…

Computation and Language · Computer Science 2020-09-18 Xiyan Fu, Jun Wang, Zhenglu Yang

SD-VSum: A Method and Dataset for Script-Driven Video Summarization

In this work, we introduce the task of script-driven video summarization, which aims to produce a summary of the full-length video by selecting the parts that are most relevant to a user-provided script outlining the visual content of the…

Computer Vision and Pattern Recognition · Computer Science 2025-09-23 Manolis Mylonas, Evlampios Apostolidis, Vasileios Mezaris

SD-MVSum: Script-Driven Multimodal Video Summarization Method and Datasets

In this work, we present a method and two large-scale datasets for Script-Driven Multimodal Video Summarization. The proposed method, SD-MVSum, builds on our earlier SD-VSum method for script-driven video summarization, which considered…

Computer Vision and Pattern Recognition · Computer Science 2026-05-08 Manolis Mylonas, Charalampia Zerva, Evlampios Apostolidis, Vasileios Mezaris

Video Summarization Overview

With the broad growth of video capturing devices and applications on the web, it is more demanding to provide desired video content for users efficiently. Video summarization facilitates quickly grasping video content by creating a compact…

Computer Vision and Pattern Recognition · Computer Science 2022-10-24 Mayu Otani, Yale Song, Yang Wang

Video Summarization using Deep Semantic Features

This paper presents a video summarization technique for an Internet video to provide a quick way to overview its content. This is a challenging problem because finding important or informative parts of the original video requires to…

Computer Vision and Pattern Recognition · Computer Science 2016-09-29 Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Naokazu Yokoya

Comprehensive Video Understanding: Video summarization with content-based video recommender design

Video summarization aims to extract keyframes/shots from a long video. Previous methods mainly take diversity and representativeness of generated summaries as prior knowledge in algorithm design. In this paper, we formulate video…

Computer Vision and Pattern Recognition · Computer Science 2019-10-31 Yudong Jiang, Kaixu Cui, Bo Peng, Changliang Xu

Video Summarization Using Fully Convolutional Sequence Networks

This paper addresses the problem of video summarization. Given an input video, the goal is to select a subset of the frames to create a summary video that optimally captures the important information of the input video. With the large…

Computer Vision and Pattern Recognition · Computer Science 2018-09-03 Mrigank Rochan, Linwei Ye, Yang Wang

Revisit Multimodal Meta-Learning through the Lens of Multi-Task Learning

Multimodal meta-learning is a recent problem that extends conventional few-shot meta-learning by generalizing its setup to diverse multimodal task distributions. This setup makes a step towards mimicking how humans make use of a diverse set…

Machine Learning · Computer Science 2021-10-28 Milad Abdollahzadeh, Touba Malekzadeh, Ngai-Man Cheung

Query-controllable Video Summarization

When video collections become huge, how to explore both within and across videos efficiently is challenging. Video summarization is one of the ways to tackle this issue. Traditional summarization approaches limit the effectiveness of video…

Information Retrieval · Computer Science 2020-04-09 Jia-Hong Huang, Marcel Worring

Language-Guided Self-Supervised Video Summarization Using Text Semantic Matching Considering the Diversity of the Video

Current video summarization methods rely heavily on supervised computer vision techniques, which demand time-consuming and subjective manual annotations. To overcome these limitations, we investigated self-supervised video summarization.…

Computer Vision and Pattern Recognition · Computer Science 2024-08-21 Tomoya Sugihara, Shuntaro Masuda, Ling Xiao, Toshihiko Yamasaki

Video Summarization with Large Language Models

The exponential increase in video content poses significant challenges in terms of efficient navigation, search, and retrieval, thus requiring advanced video summarization techniques. Existing video summarization methods, which heavily rely…

Computer Vision and Pattern Recognition · Computer Science 2025-06-06 Min Jung Lee, Dayoung Gong, Minsu Cho