Related papers: Something-Else: Compositional Action Recognition w…

Modelling Spatio-Temporal Interactions For Compositional Action Recognition

Humans have the natural ability to recognize actions even if the objects involved in the action or the background are changed. Humans can abstract away the action from the appearance of the objects which is referred to as compositionality…

Computer Vision and Pattern Recognition · Computer Science 2024-10-28 Ramanathan Rajendiran, Debaditya Roy, Basura Fernando

Revisiting spatio-temporal layouts for compositional action recognition

Recognizing human actions is fundamentally a spatio-temporal reasoning problem, and should be, at least to some extent, invariant to the appearance of the human and the objects involved. Motivated by this hypothesis, in this work, we take…

Computer Vision and Pattern Recognition · Computer Science 2021-11-04 Gorjan Radevski, Marie-Francine Moens, Tinne Tuytelaars

SAFCAR: Structured Attention Fusion for Compositional Action Recognition

We present a general framework for compositional action recognition -- i.e. action recognition where the labels are composed out of simpler components such as subjects, atomic-actions and objects. The main challenge in compositional action…

Computer Vision and Pattern Recognition · Computer Science 2020-12-21 Tae Soo Kim, Gregory D. Hager

A Grammatical Compositional Model for Video Action Detection

Analysis of human actions in videos demands understanding complex human dynamics, as well as the interaction between actors and context. However, these interaction relationships usually exhibit large intra-class variations from diverse…

Computer Vision and Pattern Recognition · Computer Science 2023-10-05 Zhijun Zhang, Xu Zou, Jiahuan Zhou, Sheng Zhong, Ying Wu

Compositional Learning in Transformer-Based Human-Object Interaction Detection

Human-object interaction (HOI) detection is an important part of understanding human activities and visual scenes. The long-tailed distribution of labeled instances is a primary challenge in HOI detection, promoting research in few-shot and…

Computer Vision and Pattern Recognition · Computer Science 2023-08-14 Zikun Zhuang, Ruihao Qian, Chi Xie, Shuang Liang

Compositional Structure Learning for Action Understanding

The focus of the action understanding literature has predominately been classification, however, there are many applications demanding richer action understanding such as mobile robotics and video search, with solutions to classification,…

Computer Vision and Pattern Recognition · Computer Science 2014-10-23 Ran Xu, Gang Chen, Caiming Xiong, Wei Chen, Jason J. Corso

C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition

Compositional actions consist of dynamic (verbs) and static (objects) concepts. Humans can easily recognize unseen compositions using the learned concepts. For machines, solving such a problem requires a model to recognize unseen actions…

Computer Vision and Pattern Recognition · Computer Science 2024-07-22 Rongchang Li, Zhenhua Feng, Tianyang Xu, Linze Li, Xiao-Jun Wu, Muhammad Awais, Sara Atito, Josef Kittler

Reasoning About Human-Object Interactions Through Dual Attention Networks

Objects are entities we act upon, where the functionality of an object is determined by how we interact with it. In this work we propose a Dual Attention Network model which reasons about human-object interactions. The dual-attentional…

Computer Vision and Pattern Recognition · Computer Science 2019-09-12 Tete Xiao, Quanfu Fan, Dan Gutfreund, Mathew Monfort, Aude Oliva, Bolei Zhou

Compositional diversity in visual concept learning

Humans leverage compositionality to efficiently learn new concepts, understanding how familiar parts can combine together to form novel objects. In contrast, popular computer vision models struggle to make the same types of inferences,…

Computer Vision and Pattern Recognition · Computer Science 2023-06-01 Yanli Zhou, Reuben Feinman, Brenden M. Lake

Action Recognition based on Cross-Situational Action-object Statistics

Machine learning models of visual action recognition are typically trained and tested on data from specific situations where actions are associated with certain objects. It is an open question how action-object associations in the training…

Computer Vision and Pattern Recognition · Computer Science 2022-08-16 Satoshi Tsutsui, Xizi Wang, Guangyuan Weng, Yayun Zhang, David Crandall, Chen Yu

Development of Compositionality and Generalization through Interactive Learning of Language and Action of Robots

Humans excel at applying learned behavior to unlearned situations. A crucial component of this generalization behavior is our ability to compose/decompose a whole into reusable parts, an attribute known as compositionality. One of the…

Artificial Intelligence · Computer Science 2024-07-24 Prasanna Vijayaraghavan, Jeffrey Frederic Queisser, Sergio Verduzco Flores, Jun Tani

Learning Latent Spatio-Temporal Compositional Model for Human Action Recognition

Action recognition is an important problem in multimedia understanding. This paper addresses this problem by building an expressive compositional action model. We model one action instance in the video with an ensemble of spatio-temporal…

Computer Vision and Pattern Recognition · Computer Science 2015-02-03 Xiaodan Liang, Liang Lin, Liangliang Cao

Attentive Action and Context Factorization

We propose a method for human action recognition, one that can localize the spatiotemporal regions that 'define' the actions. This is a challenging task due to the subtlety of human actions in video and the co-occurrence of contextual…

Computer Vision and Pattern Recognition · Computer Science 2019-04-12 Yang Wang, Vinh Tran, Gedas Bertasius, Lorenzo Torresani, Minh Hoai

Video action detection by learning graph-based spatio-temporal interactions

Action Detection is a complex task that aims to detect and classify human actions in video clips. Typically, it has been addressed by processing fine-grained features extracted from a video classification backbone. Recently, thanks to the…

Computer Vision and Pattern Recognition · Computer Science 2021-03-02 Matteo Tomei, Lorenzo Baraldi, Simone Calderara, Simone Bronzin, Rita Cucchiara

Seeing What You're Told: Sentence-Guided Activity Recognition In Video

We present a system that demonstrates how the compositional structure of events, in concert with the compositional structure of language, can interplay with the underlying focusing mechanisms in video action recognition, thereby providing a…

Computer Vision and Pattern Recognition · Computer Science 2014-05-29 N. Siddharth, Andrei Barbu, Jeffrey Mark Siskind

Spatio-temporal Action Recognition: A Survey

The task of action recognition or action detection involves analyzing videos and determining what action or motion is being performed. The primary subject of these videos are predominantly humans performing some action. However, this…

Computer Vision and Pattern Recognition · Computer Science 2019-01-29 Amlaan Bhoi

Zero-Shot Action Recognition from Diverse Object-Scene Compositions

This paper investigates the problem of zero-shot action recognition, in the setting where no training videos with seen actions are available. For this challenging scenario, the current leading approach is to transfer knowledge from the…

Computer Vision and Pattern Recognition · Computer Science 2021-10-27 Carlo Bretti, Pascal Mettes

Collaborative Learning for 3D Hand-Object Reconstruction and Compositional Action Recognition from Egocentric RGB Videos Using Superquadrics

With the availability of egocentric 3D hand-object interaction datasets, there is increasing interest in developing unified models for hand-object pose estimation and action recognition. However, existing methods still struggle to recognise…

Computer Vision and Pattern Recognition · Computer Science 2025-01-14 Tze Ho Elden Tse, Runyang Feng, Linfang Zheng, Jiho Park, Yixing Gao, Jihie Kim, Ales Leonardis, Hyung Jin Chang

Action2Activity: Recognizing Complex Activities from Sensor Data

As compared to simple actions, activities are much more complex, but semantically consistent with a human's real life. Techniques for action recognition from sensor generated data are mature. However, there has been relatively little work…

Computer Vision and Pattern Recognition · Computer Science 2016-11-08 Ye Liu, Liqiang Nie, Lei Han, Luming Zhang, David S Rosenblum

TEACH: Temporal Action Composition for 3D Humans

Given a series of natural language descriptions, our task is to generate 3D human motions that correspond semantically to the text, and follow the temporal order of the instructions. In particular, our goal is to enable the synthesis of a…

Computer Vision and Pattern Recognition · Computer Science 2022-09-13 Nikos Athanasiou, Mathis Petrovich, Michael J. Black, Gül Varol