Related papers: Technical Report: Temporal Aggregate Representatio…

Temporal Aggregate Representations for Long-Range Video Understanding

Future prediction, especially in long-range videos, requires reasoning from current and past observations. In this work, we address questions of temporal extent, scaling, and level of semantic abstraction with a flexible multi-granular…

Computer Vision and Pattern Recognition · Computer Science 2020-08-03 Fadime Sener, Dipika Singhania, Angela Yao

Exploring Temporal Granularity in Self-Supervised Video Representation Learning

This work presents a self-supervised learning framework named TeG to explore Temporal Granularity in learning video representations. In TeG, we sample a long clip from a video and a short clip that lies inside the long clip. We then extract…

Computer Vision and Pattern Recognition · Computer Science 2021-12-09 Rui Qian, Yeqing Li, Liangzhe Yuan, Boqing Gong, Ting Liu, Matthew Brown, Serge Belongie, Ming-Hsuan Yang, Hartwig Adam, Yin Cui

A Survey on Temporal Graph Representation Learning and Generative Modeling

Temporal graphs represent the dynamic relationships among entities and occur in many real-life applications such as social networks, e-commerce, communication, road networks, biological systems, and many more. They necessitate research beyond…

Machine Learning · Computer Science 2022-08-26 Shubham Gupta, Srikanta Bedathur

Video Understanding: Through A Temporal Lens

This thesis explores the central question of how to leverage temporal relations among video elements to advance video understanding. Addressing the limitations of existing methods, the work presents a five-fold contribution: (1) an…

Computer Vision and Pattern Recognition · Computer Science 2026-04-06 Thong Thanh Nguyen

Learning Temporal Embeddings for Complex Video Analysis

In this paper, we propose to learn temporal embeddings of video frames for complex video analysis. Large quantities of unlabeled video data can be easily obtained from the Internet. These videos possess the implicit weak label that they are…

Computer Vision and Pattern Recognition · Computer Science 2015-05-05 Vignesh Ramanathan, Kevin Tang, Greg Mori, Li Fei-Fei

Temporal Reasoning Graph for Activity Recognition

Despite the great success achieved in activity analysis, many challenges remain. Most existing work in activity recognition pays more attention to designing efficient architectures or video sampling strategies. However, due to the…

Computer Vision and Pattern Recognition · Computer Science 2019-08-28 Jingran Zhang, Fumin Shen, Xing Xu, Heng Tao Shen

Representations and Ensemble Methods for Dynamic Relational Classification

Temporal networks are ubiquitous and evolve over time by the addition, deletion, and changing of links, nodes, and attributes. Although many relational datasets contain temporal information, the majority of existing techniques in relational…

Artificial Intelligence · Computer Science 2011-11-23 Ryan A. Rossi, Jennifer Neville

Fine-Grained Temporal Relation Extraction

We present a novel semantic framework for modeling temporal relations and event durations that maps pairs of events to real-valued scales. We use this framework to construct the largest temporal relations dataset to date, covering the…

Computation and Language · Computer Science 2019-06-05 Siddharth Vashishtha, Benjamin Van Durme, Aaron Steven White

How Much Temporal Long-Term Context is Needed for Action Segmentation?

Modeling long-term context in videos is crucial for many fine-grained tasks including temporal action segmentation. An interesting question that is still open is how much long-term temporal context is needed for optimal performance. While…

Computer Vision and Pattern Recognition · Computer Science 2023-09-26 Emad Bahrami, Gianpiero Francesca, Juergen Gall

Enabling Interactivity on Displays of Multivariate Time Series and Longitudinal Data

Temporal data is information measured in the context of time. This contextual structure provides components that need to be explored to understand the data and that can form the basis of interactions applied to the plots. In multivariate…

Computation · Statistics 2014-12-23 Xiaoyue Cheng, Dianne Cook, Heike Hofmann

Subtopic-aware View Sampling and Temporal Aggregation for Long-form Document Matching

Long-form document matching aims to judge the relevance between two documents and has been applied to various scenarios. Most existing works utilize hierarchical or long context models to process documents, which achieve coarse…

Information Retrieval · Computer Science 2024-12-25 Youchao Zhou, Heyan Huang, Zhijing Wu, Yuhang Liu, Xinglin Wang

Long-Term Feature Banks for Detailed Video Understanding

To understand the world, we humans constantly need to relate the present to the past, and put events in context. In this paper, we enable existing video models to do the same. We propose a long-term feature bank---supportive information…

Computer Vision and Pattern Recognition · Computer Science 2019-04-19 Chao-Yuan Wu, Christoph Feichtenhofer, Haoqi Fan, Kaiming He, Philipp Krähenbühl, Ross Girshick

Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling

While recent large-scale video-language pre-training made great progress in video question answering, the design of spatial modeling of video-language models is less fine-grained than that of image-language models; existing practices of…

Computer Vision and Pattern Recognition · Computer Science 2022-10-11 Hsin-Ying Lee, Hung-Ting Su, Bing-Chen Tsai, Tsung-Han Wu, Jia-Fong Yeh, Winston H. Hsu

Temporally Consistent Transformers for Video Generation

To generate accurate videos, algorithms have to understand the spatial and temporal dependencies in the world. Current algorithms enable accurate predictions over short horizons but tend to suffer from temporal inconsistencies. When…

Computer Vision and Pattern Recognition · Computer Science 2023-06-02 Wilson Yan, Danijar Hafner, Stephen James, Pieter Abbeel

Temporal Query Networks for Fine-grained Video Understanding

Our objective in this work is fine-grained classification of actions in untrimmed videos, where the actions may be temporally extended or may span only a few frames of the video. We cast this into a query-response mechanism, where each…

Computer Vision and Pattern Recognition · Computer Science 2021-04-20 Chuhan Zhang, Ankush Gupta, Andrew Zisserman

Snapshot Semantics for Temporal Multiset Relations (Extended Version)

Snapshot semantics is widely used for evaluating queries over temporal data: temporal relations are seen as sequences of snapshot relations, and queries are evaluated at each snapshot. In this work, we demonstrate that current approaches…

Databases · Computer Science 2019-02-14 Anton Dignös, Boris Glavic, Xing Niu, Michael Böhlen, Johann Gamper

EgoGraph: Temporal Knowledge Graph for Egocentric Video Understanding

Ultra-long egocentric videos spanning multiple days present significant challenges for video understanding. Existing approaches still rely on fragmented local processing and limited temporal modeling, restricting their ability to reason…

Computer Vision and Pattern Recognition · Computer Science 2026-03-02 Shitong Sun, Ke Han, Yukai Huang, Weitong Cai, Jifei Song

RAG Meets Temporal Graphs: Time-Sensitive Modeling and Retrieval for Evolving Knowledge

Knowledge is inherently time-sensitive and continuously evolves over time. Although current Retrieval-Augmented Generation (RAG) systems enrich LLMs with external knowledge, they largely ignore this temporal nature. This raises two…

Information Retrieval · Computer Science 2025-10-16 Jiale Han, Austin Cheung, Yubai Wei, Zheng Yu, Xusheng Wang, Bing Zhu, Yi Yang

Video Diffusion Models

Generating temporally coherent high fidelity video is an important milestone in generative modeling research. We make progress towards this milestone by proposing a diffusion model for video generation that shows very promising initial…

Computer Vision and Pattern Recognition · Computer Science 2022-06-24 Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, David J. Fleet

Video Is Graph: Structured Graph Module for Video Action Recognition

In the field of action recognition, video clips are always treated as ordered frames for subsequent processing. To achieve spatio-temporal perception, existing approaches propose to embed adjacent temporal interaction in the convolutional…

Computer Vision and Pattern Recognition · Computer Science 2022-02-01 Rongchang Li, Xiao-Jun Wu, Tianyang Xu