Related papers: Transformers in Action: Weakly Supervised Action S…

On Evaluating Weakly Supervised Action Segmentation Methods

Action segmentation is the task of temporally segmenting every frame of an untrimmed video. Weakly supervised approaches to action segmentation, especially from transcripts, have been of considerable interest to the computer vision…

Computer Vision and Pattern Recognition · Computer Science 2021-10-22 Yaser Souri, Alexander Richard, Luca Minciullo, Juergen Gall

Weakly supervised learning of actions from transcripts

We present an approach for weakly supervised learning of human actions from video transcriptions. Our system is based on the idea that, given a sequence of input data and a transcript, i.e. a list of the order the actions occur in the…

Computer Vision and Pattern Recognition · Computer Science 2017-06-20 Hilde Kuehne, Alexander Richard, Juergen Gall

Efficient and Effective Weakly-Supervised Action Segmentation via Action-Transition-Aware Boundary Alignment

Weakly-supervised action segmentation is the task of learning to partition a long video into several action segments, where training videos are only accompanied by transcripts (ordered lists of actions). Most existing methods need to infer…

Computer Vision and Pattern Recognition · Computer Science 2024-03-29 Angchi Xu, Wei-Shi Zheng

SCT: Set Constrained Temporal Transformer for Set Supervised Action Segmentation

Temporal action segmentation is a topic of increasing interest; however, annotating each frame in a video is cumbersome and costly. Weakly supervised approaches therefore aim at learning temporal action segmentation from videos that are…

Computer Vision and Pattern Recognition · Computer Science 2020-04-01 Mohsen Fayyaz, Juergen Gall

Action Sets: Weakly Supervised Action Segmentation without Ordering Constraints

Action detection and temporal segmentation of actions in videos are topics of increasing interest. While fully supervised systems have gained much attention lately, full annotation of each action within the video is costly and impractical…

Computer Vision and Pattern Recognition · Computer Science 2018-05-18 Alexander Richard, Hilde Kuehne, Juergen Gall

Weakly Supervised Action Learning with RNN based Fine-to-coarse Modeling

We present an approach for weakly supervised learning of human actions. Given a set of videos and an ordered list of the occurring actions, the goal is to infer start and end frames of the related action classes within the video and to…

Computer Vision and Pattern Recognition · Computer Science 2017-10-10 Alexander Richard, Hilde Kuehne, Juergen Gall

Robust Action Segmentation from Timestamp Supervision

Action segmentation is the task of predicting an action label for each frame of an untrimmed video. As obtaining annotations to train an approach for action segmentation in a fully supervised way is expensive, various approaches have been…

Computer Vision and Pattern Recognition · Computer Science 2022-10-14 Yaser Souri, Yazan Abu Farha, Emad Bahrami, Gianpiero Francesca, Juergen Gall

A Hybrid RNN-HMM Approach for Weakly Supervised Temporal Action Segmentation

Action recognition has become a rapidly developing research field within the last decade. But with the increasing demand for large-scale data, the need for hand-annotated training data becomes more and more impractical. One way to…

Computer Vision and Pattern Recognition · Computer Science 2019-06-05 Hilde Kuehne, Alexander Richard, Juergen Gall

Learning Transferable Self-attentive Representations for Action Recognition in Untrimmed Videos with Weak Supervision

Action recognition in videos has attracted a lot of attention in the past decade. In order to learn robust models, previous methods usually assume videos are trimmed as short sequences and require ground-truth annotations of each video…

Computer Vision and Pattern Recognition · Computer Science 2019-02-21 Xiao-Yu Zhang, Haichao Shi, Changsheng Li, Kai Zheng, Xiaobin Zhu, Lixin Duan

ASFormer: Transformer for Action Segmentation

Algorithms for the action segmentation task typically use temporal models to predict what action is occurring at each frame for a minute-long daily activity. Recent studies have shown the potential of Transformer in modeling the relations…

Computer Vision and Pattern Recognition · Computer Science 2021-10-19 Fangqiu Yi, Hongyu Wen, Tingting Jiang

Understanding Video Transformers for Segmentation: A Survey of Application and Interpretability

Video segmentation encompasses a wide range of categories of problem formulation, e.g., object, scene, actor-action and multimodal video segmentation, for delineating task-specific scene components with pixel-level masks. Recently,…

Computer Vision and Pattern Recognition · Computer Science 2023-10-20 Rezaul Karim, Richard P. Wildes

Temporal Action Segmentation from Timestamp Supervision

Temporal action segmentation approaches have been very successful recently. However, annotating videos with frame-wise labels to train such models is very expensive and time consuming. While weakly supervised methods trained using only…

Computer Vision and Pattern Recognition · Computer Science 2021-03-29 Zhe Li, Yazan Abu Farha, Juergen Gall

Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation

This paper introduces a unified framework for video action segmentation via sequence to sequence (seq2seq) translation in a fully and timestamp supervised setup. In contrast to current state-of-the-art frame-level prediction methods, we…

Computer Vision and Pattern Recognition · Computer Science 2022-10-12 Nadine Behrmann, S. Alireza Golestaneh, Zico Kolter, Juergen Gall, Mehdi Noroozi

Spatio-Temporal Action Localization in a Weakly Supervised Setting

Enabling computational systems with the ability to localize actions in video-based content has manifold applications. Traditionally, such a problem is approached in a fully-supervised setting where video-clips with complete frame-by-frame…

Computer Vision and Pattern Recognition · Computer Science 2019-05-07 Kurt Degiorgio, Fabio Cuzzolin

Semi-Supervised Domain Adaptation for Weakly Labeled Semantic Video Object Segmentation

Deep convolutional neural networks (CNNs) have been immensely successful in many high-level computer vision tasks given large labeled datasets. However, for video semantic object segmentation, a domain where labels are scarce, effectively…

Computer Vision and Pattern Recognition · Computer Science 2016-06-08 Huiling Wang, Tapani Raiko, Lasse Lensu, Tinghuai Wang, Juha Karhunen

Hierarchical Modeling for Task Recognition and Action Segmentation in Weakly-Labeled Instructional Videos

This paper focuses on task recognition and action segmentation in weakly-labeled instructional videos, where only the ordered sequence of video-level actions is available during training. We propose a two-stream framework, which exploits…

Computer Vision and Pattern Recognition · Computer Science 2021-10-13 Reza Ghoddoosian, Saif Sayed, Vassilis Athitsos

Distill and Collect for Semi-Supervised Temporal Action Segmentation

Recent temporal action segmentation approaches need frame annotations during training to be effective. These annotations are very expensive and time-consuming to obtain. This limits their performances when only limited annotated data is…

Computer Vision and Pattern Recognition · Computer Science 2022-11-04 Sovan Biswas, Anthony Rhodes, Ramesh Manuvinakurike, Giuseppe Raffa, Richard Beckwith

Learning to Segment Actions from Observation and Narration

We apply a generative segmental model of task structure, guided by narration, to action segmentation in video. We focus on unsupervised and weakly-supervised settings where no action labels are known during training. Despite its simplicity,…

Computation and Language · Computer Science 2020-08-13 Daniel Fried, Jean-Baptiste Alayrac, Phil Blunsom, Chris Dyer, Stephen Clark, Aida Nematzadeh

Temporal Segment Transformer for Action Segmentation

Recognizing human actions from untrimmed videos is an important task in activity understanding, and poses unique challenges in modeling long-range temporal relations. Recent works adopt a predict-and-refine strategy which converts an…

Computer Vision and Pattern Recognition · Computer Science 2023-02-28 Zhichao Liu, Leshan Wang, Desen Zhou, Jian Wang, Songyang Zhang, Yang Bai, Errui Ding, Rui Fan

Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment

In this work, we address the task of weakly-supervised human action segmentation in long, untrimmed videos. Recent methods have relied on expensive learning models, such as Recurrent Neural Networks (RNN) and Hidden Markov Models (HMM).…

Computer Vision and Pattern Recognition · Computer Science 2023-05-22 Li Ding, Chenliang Xu