Related papers: SeCo: Exploring Sequence Supervision for Unsupervi…

Momentum Contrast for Unsupervised Visual Representation Learning

We present Momentum Contrast (MoCo) for unsupervised visual representation learning. From a perspective on contrastive learning as dictionary look-up, we build a dynamic dictionary with a queue and a moving-averaged encoder. This enables…

Computer Vision and Pattern Recognition · Computer Science 2020-03-25 Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, Ross Girshick

VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples

MoCo is effective for unsupervised image representation learning. In this paper, we propose VideoMoCo for unsupervised video representation learning. Given a video sequence as an input sample, we improve the temporal feature representations…

Computer Vision and Pattern Recognition · Computer Science 2021-03-18 Tian Pan, Yibing Song, Tianyu Yang, Wenhao Jiang, Wei Liu

Unsupervised Video Representation Learning by Bidirectional Feature Prediction

This paper introduces a novel method for self-supervised video representation learning via feature prediction. In contrast to the previous methods that focus on future feature prediction, we argue that a supervisory signal arising from…

Computer Vision and Pattern Recognition · Computer Science 2020-11-13 Nadine Behrmann, Juergen Gall, Mehdi Noroozi

Audio-Visual Contrastive Learning with Temporal Self-Supervision

We propose a self-supervised learning approach for videos that learns representations of both the RGB frames and the accompanying audio without human supervision. In contrast to images that capture the static scene appearance, videos also…

Computer Vision and Pattern Recognition · Computer Science 2023-02-16 Simon Jenni, Alexander Black, John Collomosse

Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations

Contrastive self-supervised learning has outperformed supervised pretraining on many downstream tasks like segmentation and object detection. However, current methods are still primarily applied to curated datasets like ImageNet. In this…

Computer Vision and Pattern Recognition · Computer Science 2021-12-15 Wouter Van Gansbeke, Simon Vandenhende, Stamatios Georgoulis, Luc Van Gool

Unsupervised Representation Learning by Sorting Sequences

We present an unsupervised representation learning approach using videos without semantic labels. We leverage the temporal coherence as a supervisory signal by formulating representation learning as a sequence sorting task. We take…

Computer Vision and Pattern Recognition · Computer Science 2017-08-04 Hsin-Ying Lee, Jia-Bin Huang, Maneesh Singh, Ming-Hsuan Yang

CoCon: Cooperative-Contrastive Learning

Labeling videos at scale is impractical. Consequently, self-supervised visual representation learning is key for efficient video analysis. Recent success in learning image representations suggests contrastive learning is a promising…

Computer Vision and Pattern Recognition · Computer Science 2021-05-03 Nishant Rai, Ehsan Adeli, Kuan-Hui Lee, Adrien Gaidon, Juan Carlos Niebles

SupCL-Seq: Supervised Contrastive Learning for Downstream Optimized Sequence Representations

While contrastive learning is proven to be an effective training strategy in computer vision, Natural Language Processing (NLP) is only recently adopting it as a self-supervised alternative to Masked Language Modeling (MLM) for improving…

Computation and Language · Computer Science 2021-09-16 Hooman Sedghamiz, Shivam Raval, Enrico Santus, Tuka Alhanai, Mohammad Ghassemi

ASCNet: Self-supervised Video Representation Learning with Appearance-Speed Consistency

We study self-supervised video representation learning, which is a challenging task due to 1) lack of labels for explicit supervision; 2) unstructured and noisy visual information. Existing methods mainly use contrastive loss with video…

Computer Vision and Pattern Recognition · Computer Science 2021-08-18 Deng Huang, Wenhao Wu, Weiwen Hu, Xu Liu, Dongliang He, Zhihua Wu, Xiangmiao Wu, Mingkui Tan, Errui Ding

Hierarchical Contrast for Unsupervised Skeleton-based Action Representation Learning

This paper targets unsupervised skeleton-based action representation learning and proposes a new Hierarchical Contrast (HiCo) framework. Different from the existing contrastive-based solutions that typically represent an input skeleton…

Computer Vision and Pattern Recognition · Computer Science 2022-12-06 Jianfeng Dong, Shengkai Sun, Zhonglin Liu, Shujie Chen, Baolong Liu, Xun Wang

Self-supervised and Weakly Supervised Contrastive Learning for Frame-wise Action Representations

Previous work on action representation learning focused on global representations for short video clips. In contrast, many practical applications, such as video alignment, strongly demand learning the intensive representation of long…

Computer Vision and Pattern Recognition · Computer Science 2023-03-03 Minghao Chen, Renbo Tu, Chenxi Huang, Yuqi Lin, Boxi Wu, Deng Cai

Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency

Natural videos provide rich visual contents for self-supervised learning. Yet most existing approaches for learning spatio-temporal representations rely on manually trimmed videos, leading to limited diversity in visual patterns and limited…

Computer Vision and Pattern Recognition · Computer Science 2022-04-08 Zhiwu Qing, Shiwei Zhang, Ziyuan Huang, Yi Xu, Xiang Wang, Mingqian Tang, Changxin Gao, Rong Jin, Nong Sang

Time-Equivariant Contrastive Video Representation Learning

We introduce a novel self-supervised contrastive learning method to learn representations from unlabelled videos. Existing approaches ignore the specifics of input distortions, e.g., by learning invariance to temporal transformations.…

Computer Vision and Pattern Recognition · Computer Science 2021-12-08 Simon Jenni, Hailin Jin

Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning

As a pioneering work, PointContrast conducts unsupervised 3D representation learning via leveraging contrastive learning over raw RGB-D frames and proves its effectiveness on various downstream tasks. However, the trend of large-scale…

Computer Vision and Pattern Recognition · Computer Science 2023-03-27 Xiaoyang Wu, Xin Wen, Xihui Liu, Hengshuang Zhao

Unsupervised Video Understanding by Reconciliation of Posture Similarities

Understanding human activity and being able to explain it in detail surpasses mere action classification by far in both complexity and value. The challenge is thus to describe an activity on the basis of its most fundamental constituents,…

Computer Vision and Pattern Recognition · Computer Science 2017-08-04 Timo Milbich, Miguel Bautista, Ekaterina Sutter, Bjorn Ommer

Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework

We propose a self-supervised method to learn feature representations from videos. A standard approach in traditional self-supervised methods uses positive-negative data pairs to train with contrastive learning strategy. In such a case,…

Computer Vision and Pattern Recognition · Computer Science 2020-08-13 Li Tao, Xueting Wang, Toshihiko Yamasaki

No More Shortcuts: Realizing the Potential of Temporal Self-Supervision

Self-supervised approaches for video have shown impressive results in video understanding tasks. However, unlike early works that leverage temporal self-supervision, current state-of-the-art methods primarily rely on tasks from the image…

Computer Vision and Pattern Recognition · Computer Science 2023-12-21 Ishan Rajendrakumar Dave, Simon Jenni, Mubarak Shah

Hierarchically Decoupled Spatial-Temporal Contrast for Self-supervised Video Representation Learning

We present a novel technique for self-supervised video representation learning by: (a) decoupling the learning objective into two contrastive subtasks respectively emphasizing spatial and temporal features, and (b) performing it…

Computer Vision and Pattern Recognition · Computer Science 2021-09-02 Zehua Zhang, David Crandall

Contrastive Separative Coding for Self-supervised Representation Learning

To extract robust deep representations from long sequential modeling of speech data, we propose a self-supervised learning approach, namely Contrastive Separative Coding (CSC). Our key finding is to learn such representations by separating…

Audio and Speech Processing · Electrical Eng. & Systems 2021-03-02 Jun Wang, Max W. Y. Lam, Dan Su, Dong Yu

Sequence-to-Sequence Contrastive Learning for Text Recognition

We propose a framework for sequence-to-sequence contrastive learning (SeqCLR) of visual representations, which we apply to text recognition. To account for the sequence-to-sequence structure, each feature map is divided into different…

Computer Vision and Pattern Recognition · Computer Science 2020-12-22 Aviad Aberdam, Ron Litman, Shahar Tsiper, Oron Anschel, Ron Slossberg, Shai Mazor, R. Manmatha, Pietro Perona