Related papers: Multi-task Learning with Extended Temporal Shift M…

Action Recognition Using Temporal Shift Module and Ensemble Learning

This paper presents the first-rank solution for the Multi-Modal Action Recognition Challenge, part of the Multi-Modal Visual Pattern Recognition Workshop at the International Conference on Pattern Recognition (ICPR) 2024. The competition aimed to recognize human actions using a…

Computer Vision and Pattern Recognition · Computer Science 2025-02-11 Anh-Kiet Duong, Petra Gomez-Krämer

The Solution for Temporal Action Localisation Task of Perception Test Challenge 2024

This report presents our method for Temporal Action Localisation (TAL), which focuses on identifying and classifying actions within specific time intervals throughout a video sequence. We employ a data augmentation technique by expanding…

Computer Vision and Pattern Recognition · Computer Science 2024-10-15 Yinan Han, Qingyuan Jiang, Hongming Mei, Yang Yang, Jinhui Tang

Transferable Knowledge-Based Multi-Granularity Aggregation Network for Temporal Action Localization: Submission to ActivityNet Challenge 2021

This technical report presents an overview of our solution used in the submission to the 2021 HACS Temporal Action Localization Challenge on both the Supervised Learning Track and the Weakly-Supervised Learning Track. Temporal Action Localization (TAL)…

Computer Vision and Pattern Recognition · Computer Science 2021-07-28 Haisheng Su, Peiqin Zhuang, Yukun Li, Dongliang Wang, Weihao Gan, Wei Wu, Yu Qiao

Temporal Action Localization with Cross Layer Task Decoupling and Refinement

Temporal action localization (TAL) involves dual tasks to classify and localize actions within untrimmed videos. However, the two tasks often have conflicting requirements for features. Existing methods typically employ separate heads for…

Computer Vision and Pattern Recognition · Computer Science 2024-12-16 Qiang Li, Di Liu, Jun Kong, Sen Li, Hui Xu, Jianzhong Wang

Visual Self-paced Iterative Learning for Unsupervised Temporal Action Localization

Recently, temporal action localization (TAL) has garnered significant interest in the information retrieval community. However, existing supervised/weakly supervised methods are heavily dependent on extensive labeled temporal boundaries and…

Computer Vision and Pattern Recognition · Computer Science 2025-04-01 Yupeng Hu, Han Jiang, Hao Liu, Kun Wang, Haoyu Tang, Liqiang Nie

Enhancing Temporal Action Localization: Advanced S6 Modeling with Recurrent Mechanism

Temporal Action Localization (TAL) is a critical task in video analysis, identifying precise start and end times of actions. Existing methods like CNNs, RNNs, GCNs, and Transformers have limitations in capturing long-range dependencies and…

Computer Vision and Pattern Recognition · Computer Science 2024-07-19 Sangyoun Lee, Juho Jung, Changdae Oh, Sunghee Yun

Enriching Local and Global Contexts for Temporal Action Localization

Effectively tackling the problem of temporal action localization (TAL) necessitates a visual representation that jointly pursues two confounding goals, i.e., fine-grained discrimination for temporal localization and sufficient visual…

Computer Vision and Pattern Recognition · Computer Science 2021-08-10 Zixin Zhu, Wei Tang, Le Wang, Nanning Zheng, Gang Hua

Background-Click Supervision for Temporal Action Localization

Weakly supervised temporal action localization aims at learning the instance-level action pattern from the video-level labels, where a significant challenge is action-context confusion. To overcome this challenge, one recent work builds an…

Computer Vision and Pattern Recognition · Computer Science 2021-11-25 Le Yang, Junwei Han, Tao Zhao, Tianwei Lin, Dingwen Zhang, Jianxin Chen

CLIP-AE: CLIP-assisted Cross-view Audio-Visual Enhancement for Unsupervised Temporal Action Localization

Temporal Action Localization (TAL) has garnered significant attention in information retrieval. Existing supervised or weakly supervised methods heavily rely on labeled temporal boundaries and action categories, which are labor-intensive…

Computer Vision and Pattern Recognition · Computer Science 2025-06-06 Rui Xia, Dan Jiang, Quan Zhang, Ke Zhang, Chun Yuan

Active Learning with Effective Scoring Functions for Semi-Supervised Temporal Action Localization

Temporal Action Localization (TAL) aims to predict both action category and temporal boundary of action instances in untrimmed videos, i.e., start and end time. Fully-supervised solutions are usually adopted in most existing works, and…

Computer Vision and Pattern Recognition · Computer Science 2022-12-02 Ding Li, Xuebing Yang, Yongqiang Tang, Chenyang Zhang, Wensheng Zhang

Unsupervised Pre-training for Temporal Action Localization Tasks

Unsupervised video representation learning has made remarkable achievements in recent years. However, most existing methods are designed and optimized for video classification. These pre-trained models can be sub-optimal for temporal…

Computer Vision and Pattern Recognition · Computer Science 2022-03-28 Can Zhang, Tianyu Yang, Junwu Weng, Meng Cao, Jue Wang, Yuexian Zou

Weakly Supervised Temporal Action Localization Through Learning Explicit Subspaces for Action and Context

Weakly-supervised Temporal Action Localization (WS-TAL) methods learn to localize temporal starts and ends of action instances in a video under only video-level supervision. Existing WS-TAL methods rely on deep features learned for action…

Computer Vision and Pattern Recognition · Computer Science 2021-03-31 Ziyi Liu, Le Wang, Wei Tang, Junsong Yuan, Nanning Zheng, Gang Hua

Weakly-Supervised Temporal Action Localization Through Local-Global Background Modeling

The Weakly-Supervised Temporal Action Localization (WS-TAL) task aims to recognize and localize temporal starts and ends of action instances in an untrimmed video with only video-level label supervision. Due to the lack of negative samples of…

Computer Vision and Pattern Recognition · Computer Science 2021-06-23 Xiang Wang, Zhiwu Qing, Ziyuan Huang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Yuanjie Shao, Nong Sang

Bottom-Up Temporal Action Localization with Mutual Regularization

Recently, temporal action localization (TAL), i.e., finding specific action segments in untrimmed videos, has attracted increasing attention from the computer vision community. State-of-the-art solutions for TAL involve evaluating the…

Computer Vision and Pattern Recognition · Computer Science 2021-03-01 Peisen Zhao, Lingxi Xie, Chen Ju, Ya Zhang, Yanfeng Wang, Qi Tian

Exploring the Temporal Consistency for Point-Level Weakly-Supervised Temporal Action Localization

Point-supervised Temporal Action Localization (PTAL) adopts a lightly frame-annotated paradigm (i.e., labeling only a single frame per action instance) to train a model to effectively locate action instances within untrimmed…

Computer Vision and Pattern Recognition · Computer Science 2026-02-06 Yunchuan Ma, Laiyun Qing, Guorong Li, Yuqing Liu, Yuankai Qi, Qingming Huang

Online Temporal Action Localization with Memory-Augmented Transformer

Online temporal action localization (On-TAL) is the task of identifying multiple action instances given a streaming video. Since existing methods take as input only a video segment of fixed size per iteration, they are limited in…

Computer Vision and Pattern Recognition · Computer Science 2024-08-07 Youngkil Song, Dongkeun Kim, Minsu Cho, Suha Kwak

Action Sensitivity Learning for Temporal Action Localization

Temporal action localization (TAL), which involves recognizing and locating action instances, is a challenging task in video understanding. Most existing approaches directly predict action classes and regress offsets to boundaries, while…

Computer Vision and Pattern Recognition · Computer Science 2023-09-14 Jiayi Shao, Xiaohan Wang, Ruijie Quan, Junjun Zheng, Jiang Yang, Yi Yang

Chain-of-Evidence Multimodal Reasoning for Few-shot Temporal Action Localization

Traditional temporal action localization (TAL) methods rely on large amounts of detailed annotated data, whereas few-shot TAL reduces this dependence by using only a few training samples to identify unseen action categories. However,…

Computer Vision and Pattern Recognition · Computer Science 2025-12-29 Mengshi Qi, Hongwei Ji, Wulian Yun, Xianlin Zhang, Huadong Ma

Low-Fidelity End-to-End Video Encoder Pre-training for Temporal Action Localization

Temporal action localization (TAL) is a fundamental yet challenging task in video understanding. Existing TAL methods rely on pre-training a video encoder through action classification supervision. This results in a task discrepancy problem…

Computer Vision and Pattern Recognition · Computer Science 2021-11-01 Mengmeng Xu, Juan-Manuel Perez-Rua, Xiatian Zhu, Bernard Ghanem, Brais Martinez

Proposal-based Temporal Action Localization with Point-level Supervision

Point-level supervised temporal action localization (PTAL) aims at recognizing and localizing actions in untrimmed videos where only a single point (frame) within every action instance is annotated in training data. Without temporal…

Computer Vision and Pattern Recognition · Computer Science 2023-10-10 Yuan Yin, Yifei Huang, Ryosuke Furuta, Yoichi Sato