Related papers: Graph Based Temporal Aggregation for Video Retriev…

Use What You Have: Video Retrieval Using Representations From Collaborative Experts

The rapid growth of video on the internet has made searching for video content using natural language queries a significant challenge. Human-generated queries for video datasets 'in the wild' vary a lot in terms of degree of specificity,…

Computer Vision and Pattern Recognition · Computer Science 2020-02-17 Yang Liu, Samuel Albanie, Arsha Nagrani, Andrew Zisserman

VRT: A Video Restoration Transformer

Video restoration (e.g., video super-resolution) aims to restore high-quality frames from low-quality frames. Different from single image restoration, video restoration generally requires utilizing temporal information from multiple…

Computer Vision and Pattern Recognition · Computer Science 2022-06-16 Jingyun Liang, Jiezhang Cao, Yuchen Fan, Kai Zhang, Rakesh Ranjan, Yawei Li, Radu Timofte, Luc Van Gool

Exploiting Visual Semantic Reasoning for Video-Text Retrieval

Video retrieval is a challenging research topic bridging the vision and language areas and has attracted broad attention in recent years. Previous works have been devoted to representing videos by directly encoding from frame-level…

Computer Vision and Pattern Recognition · Computer Science 2020-06-17 Zerun Feng, Zhimin Zeng, Caili Guo, Zheng Li

Towards Efficient and Robust Moment Retrieval System: A Unified Framework for Multi-Granularity Models and Temporal Reranking

Long-form video understanding presents significant challenges for interactive retrieval systems, as conventional methods struggle to process extensive video content efficiently. Existing approaches often rely on single models, inefficient…

Computer Vision and Pattern Recognition · Computer Science 2025-04-14 Huu-Loc Tran, Tinh-Anh Nguyen-Nhu, Huu-Phong Phan-Nguyen, Tien-Huy Nguyen, Nhat-Minh Nguyen-Dich, Anh Dao, Huy-Duc Do, Quan Nguyen, Hoang M. Le, Quang-Vinh Dinh

Large-Scale Query-by-Image Video Retrieval Using Bloom Filters

We consider the problem of using image queries to retrieve videos from a database. Our focus is on large-scale applications, where it is infeasible to index each database video frame independently. Our main contribution is a framework based…

Multimedia · Computer Science 2016-07-13 Andre Araujo, Jason Chaves, Haricharan Lakshman, Roland Angst, Bernd Girod

VVS: Video-to-Video Retrieval with Irrelevant Frame Suppression

In content-based video retrieval (CBVR), dealing with large-scale collections, efficiency is as important as accuracy; thus, several video-level feature-based studies have actively been conducted. Nevertheless, owing to the severe…

Computer Vision and Pattern Recognition · Computer Science 2023-12-20 Won Jo, Geuntaek Lim, Gwangjin Lee, Hyunwoo Kim, Byungsoo Ko, Yukyung Choi

An Empirical Study of Frame Selection for Text-to-Video Retrieval

Text-to-video retrieval (TVR) aims to find the most relevant video in a large video gallery given a query text. The intricate and abundant context of the video challenges the performance and efficiency of TVR. To handle the serialized video…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 Mengxia Wu, Min Cao, Yang Bai, Ziyin Zeng, Chen Chen, Liqiang Nie, Min Zhang

Interactive Video Retrieval with Dialog

Now that everyone can easily record videos, the quantity of which is continuously increasing, research on methods for improved video retrieval is important in the contemporary world. In cases where target videos are to be identified within…

Computer Vision and Pattern Recognition · Computer Science 2019-05-08 Sho Maeoki, Kohei Uehara, Tatsuya Harada

Self-supervised Video Retrieval Transformer Network

Content-based video retrieval aims to find videos from a large video database that are similar to or even near-duplicate of a given query video. Video representation and similarity search algorithms are crucial to any video retrieval…

Computer Vision and Pattern Recognition · Computer Science 2021-04-19 Xiangteng He, Yulin Pan, Mingqian Tang, Yiliang Lv

Cross-modal Embeddings for Video and Audio Retrieval

The increasing number of online videos brings several opportunities for training self-supervised neural networks. The creation of large-scale video datasets such as YouTube-8M allows us to deal with this large amount of data in…

Information Retrieval · Computer Science 2018-01-09 Didac Surís, Amanda Duarte, Amaia Salvador, Jordi Torres, Xavier Giró-i-Nieto

Video Re-localization

Many methods have been developed to help people find the video contents they want efficiently. However, there are still some unsolved problems in this area. For example, given a query video and a reference video, how to accurately localize…

Computer Vision and Pattern Recognition · Computer Science 2018-08-07 Yang Feng, Lin Ma, Wei Liu, Tong Zhang, Jiebo Luo

Deep Learning for Video-Text Retrieval: a Review

Video-Text Retrieval (VTR) aims to search for the most relevant video related to the semantics in a given sentence, and vice versa. In general, this retrieval task is composed of four successive steps: video and textual feature…

Computer Vision and Pattern Recognition · Computer Science 2023-02-27 Cunjuan Zhu, Qi Jia, Wei Chen, Yanming Guo, Yu Liu

Minority-Oriented Vicinity Expansion with Attentive Aggregation for Video Long-Tailed Recognition

A dramatic increase in real-world video volume with extremely diverse and emerging topics naturally forms a long-tailed video distribution in terms of their categories, and it spotlights the need for Video Long-Tailed Recognition (VLTR). In…

Computer Vision and Pattern Recognition · Computer Science 2022-11-28 WonJun Moon, Hyun Seok Seong, Jae-Pil Heo

Enhancing Subsequent Video Retrieval via Vision-Language Models (VLMs)

The rapid growth of video content demands efficient and precise retrieval systems. While vision-language models (VLMs) excel in representation learning, they often struggle with adaptive, time-sensitive video retrieval. This paper…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Yicheng Duan, Xi Huang, Duo Chen

Reversed in Time: A Novel Temporal-Emphasized Benchmark for Cross-Modal Video-Text Retrieval

Cross-modal (e.g. image-text, video-text) retrieval is an important task in the fields of information retrieval and multimodal vision-language understanding. Temporal understanding makes video-text retrieval more challenging than image-text…

Computer Vision and Pattern Recognition · Computer Science 2025-09-30 Yang Du, Yuqi Liu, Qin Jin

Bidirectional Multirate Reconstruction for Temporal Modeling in Videos

Despite the recent success of neural networks in image feature learning, a major problem in the video domain is the lack of sufficient labeled data for learning to model temporal information. In this paper, we propose an unsupervised…

Computer Vision and Pattern Recognition · Computer Science 2016-11-29 Linchao Zhu, Zhongwen Xu, Yi Yang

CNN-VWII: An Efficient Approach for Large-Scale Video Retrieval by Image Queries

This paper aims to solve the problem of large-scale video retrieval by a query image. Firstly, we define the problem of top-$k$ image-to-video query. Then, we combine the merits of convolutional neural networks (CNN for short) and Bag of…

Multimedia · Computer Science 2018-10-16 Chengyuan Zhang, Yunwu Lin, Lei Zhu, Anfeng Liu, Zuping Zhang, Fang Huang

Video-adverb retrieval with compositional adverb-action embeddings

Retrieving adverbs that describe an action in a video poses a crucial step towards fine-grained video understanding. We propose a framework for video-to-adverb retrieval (and vice versa) that aligns video embeddings with their matching…

Computer Vision and Pattern Recognition · Computer Science 2023-09-27 Thomas Hummel, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata

Uncovering Hidden Challenges in Query-Based Video Moment Retrieval

Query-based moment retrieval is the problem of localising a specific clip from an untrimmed video according to a query sentence. This is a challenging task that requires interpretation of both the natural language query and the video…

Computer Vision and Pattern Recognition · Computer Science 2020-10-08 Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä

Video Super-resolution with Temporal Group Attention

Video super-resolution, which aims at producing a high-resolution video from its corresponding low-resolution version, has recently drawn increasing attention. In this work, we propose a novel method that can effectively incorporate…

Computer Vision and Pattern Recognition · Computer Science 2020-07-22 Takashi Isobe, Songjiang Li, Xu Jia, Shanxin Yuan, Gregory Slabaugh, Chunjing Xu, Ya-Li Li, Shengjin Wang, Qi Tian