Related papers: Video Re-localization

A Survey on Video Moment Localization

Video moment localization, also known as video moment retrieval, aims to search for a target segment within a video described by a given natural language query. Beyond the task of temporal action localization whereby the target actions are…

Computer Vision and Pattern Recognition · Computer Science 2023-06-14 Meng Liu, Liqiang Nie, Yunxiao Wang, Meng Wang, Yong Rui

Uncovering Hidden Challenges in Query-Based Video Moment Retrieval

Query-based moment retrieval is the problem of localising a specific clip from an untrimmed video according to a query sentence. This is a challenging task that requires interpretation of both the natural language query and the video…

Computer Vision and Pattern Recognition · Computer Science 2020-10-08 Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä

Semantic Video Moments Retrieval at Scale: A New Task and a Baseline

Motivated by the increasing need of saving search effort by obtaining relevant video clips instead of whole videos, we propose a new task, named Semantic Video Moments Retrieval at scale (SVMR), which aims at finding relevant videos coupled…

Computer Vision and Pattern Recognition · Computer Science 2022-10-18 Na Li

ReActNet: Temporal Localization of Repetitive Activities in Real-World Videos

We address the problem of temporal localization of repetitive activities in a video, i.e., the problem of identifying all segments of a video that contain some sort of repetitive or periodic motion. To do so, the proposed method represents…

Computer Vision and Pattern Recognition · Computer Science 2019-10-15 Giorgos Karvounas, Iason Oikonomidis, Antonis Argyros

RELOCATE: A Simple Training-Free Baseline for Visual Query Localization Using Region-Based Representations

We present RELOCATE, a simple training-free baseline designed to perform the challenging task of visual query localization in long videos. To eliminate the need for task-specific training and efficiently handle long videos, RELOCATE…

Computer Vision and Pattern Recognition · Computer Science 2025-12-11 Savya Khosla, Sethuraman T, Alexander Schwing, Derek Hoiem

The 2023 Video Similarity Dataset and Challenge

This work introduces a dataset, benchmark, and challenge for the problem of video copy detection and localization. The problem comprises two distinct but related tasks: determining whether a query video shares content with a reference video…

Computer Vision and Pattern Recognition · Computer Science 2023-06-19 Ed Pizzi, Giorgos Kordopatis-Zilos, Hiral Patel, Gheorghe Postelnicu, Sugosh Nagavara Ravindra, Akshay Gupta, Symeon Papadopoulos, Giorgos Tolias, Matthijs Douze

Spatio-temporal Video Re-localization by Warp LSTM

The need for efficiently finding the video content a user wants is increasing because of the explosion of user-generated videos on the Web. Existing keyword-based or content-based video retrieval methods usually determine what occurs in a…

Computer Vision and Pattern Recognition · Computer Science 2019-05-13 Yang Feng, Lin Ma, Wei Liu, Jiebo Luo

Temporal Perceiving Video-Language Pre-training

Video-Language Pre-training models have recently significantly improved various multi-modal downstream tasks. Previous dominant works mainly adopt contrastive learning to achieve global feature alignment across modalities. However, the…

Computer Vision and Pattern Recognition · Computer Science 2023-01-19 Fan Ma, Xiaojie Jin, Heng Wang, Jingjia Huang, Linchao Zhu, Jiashi Feng, Yi Yang

CityGuessr: City-Level Video Geo-Localization on a Global Scale

Video geolocalization is a crucial problem in current times. Given just a video, ascertaining where it was captured from can have a plethora of advantages. The problem of worldwide geolocalization has been tackled before, but only using the…

Computer Vision and Pattern Recognition · Computer Science 2024-11-12 Parth Parag Kulkarni, Gaurav Kumar Nayak, Mubarak Shah

Content-Based Video Browsing by Text Region Localization and Classification

The amount of digital video data is increasing worldwide. This highlights the need for efficient algorithms that can index, retrieve and browse this data by content. This can be achieved by identifying semantic description captured…

Multimedia · Computer Science 2013-01-11 Bassem Bouaziz, Walid Mahdi, Tarek Zlitni, Abdelmajid ben Hamadou

VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding

Long video understanding remains challenging for multimodal large language models (MLLMs) due to limited context windows, which necessitate identifying sparse query-relevant video segments. However, existing methods predominantly localize…

Computer Vision and Pattern Recognition · Computer Science 2026-05-04 Ruoliu Yang, Chu Wu, Caifeng Shan, Ran He, Chaoyou Fu

Query-adaptive Video Summarization via Quality-aware Relevance Estimation

Although the problem of automatic video summarization has recently received a lot of attention, the problem of creating a video summary that also highlights elements relevant to a search query has been less studied. We address this problem…

Computer Vision and Pattern Recognition · Computer Science 2017-09-29 Arun Balajee Vasudevan, Michael Gygli, Anna Volokitin, Luc Van Gool

Graph Neural Network for Video Relocalization

In this paper, we focus on the video relocalization task, which uses a query video clip as input to retrieve a semantically related video clip in another untrimmed long video. We find that in video relocalization datasets, there exists a…

Computer Vision and Pattern Recognition · Computer Science 2022-01-27 Yuan Zhou, Mingfei Wang, Ruolin Wang, Shuwei Huo

Deep Learning for Video-based Person Re-Identification: A Survey

Video-based person re-identification (video re-ID) has lately attracted growing attention due to its broad practical applications in various areas, such as surveillance, smart city, and public safety. Nevertheless, video re-ID is quite…

Computer Vision and Pattern Recognition · Computer Science 2024-10-25 Khawar Islam

Tripping through time: Efficient Localization of Activities in Videos

Localizing moments in untrimmed videos via language queries is a new and interesting task that requires the ability to accurately ground language into video. Previous works have approached this task by processing the entire video, often…

Computer Vision and Pattern Recognition · Computer Science 2020-08-19 Meera Hahn, Asim Kadav, James M. Rehg, Hans Peter Graf

Video Summarization Using Fully Convolutional Sequence Networks

This paper addresses the problem of video summarization. Given an input video, the goal is to select a subset of the frames to create a summary video that optimally captures the important information of the input video. With the large…

Computer Vision and Pattern Recognition · Computer Science 2018-09-03 Mrigank Rochan, Linwei Ye, Yang Wang

HANet: Hierarchical Alignment Networks for Video-Text Retrieval

Video-text retrieval is an important yet challenging task in vision-language understanding, which aims to learn a joint embedding space where related video and text instances are close to each other. Most current works simply measure the…

Computer Vision and Pattern Recognition · Computer Science 2021-08-02 Peng Wu, Xiangteng He, Mingqian Tang, Yiliang Lv, Jing Liu

Retrieval and Localization with Observation Constraints

Accurate visual re-localization is critical to many artificial intelligence applications, such as augmented reality, virtual reality, robotics and autonomous driving. To accomplish this task, we propose an integrated visual…

Computer Vision and Pattern Recognition · Computer Science 2021-08-20 Yuhao Zhou, Huanhuan Fan, Shuang Gao, Yuchen Yang, Xudong Zhang, Jijunnan Li, Yandong Guo

Locate before Answering: Answer Guided Question Localization for Video Question Answering

Video question answering (VideoQA) is an essential task in vision-language understanding, which has attracted considerable research attention recently. Nevertheless, existing works mostly achieve promising performances on short videos of…

Computer Vision and Pattern Recognition · Computer Science 2023-10-13 Tianwen Qian, Ran Cui, Jingjing Chen, Pai Peng, Xiaowei Guo, Yu-Gang Jiang

T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval

Text-video retrieval is a challenging task that aims to search relevant video contents based on natural language descriptions. The key to this problem is to measure text-video similarities in a joint embedding space. However, most existing…

Computer Vision and Pattern Recognition · Computer Science 2021-04-21 Xiaohan Wang, Linchao Zhu, Yi Yang