Related papers: Pretext Training Algorithms for Event Sequence Dat…

200 papers

Recently, pretext-task based methods are proposed one after another in self-supervised video feature learning. Meanwhile, contrastive learning methods also yield good performance. Usually, new methods can beat previous ones as claimed that…

Computer Vision and Pattern Recognition · Computer Science 2021-04-06 Li Tao, Xueting Wang, Toshihiko Yamasaki

Abnormal event detection in videos is a challenging problem, partly due to the multiplicity of abnormal patterns and the lack of their corresponding annotations. In this paper, we propose new constrained pretext tasks to learn object level…

Computer Vision and Pattern Recognition · Computer Science 2023-04-25 Yassine Naji, Aleksandr Setkov, Angélique Loesch, Michèle Gouiffès, Romaric Audigier

Self-supervision is one of the hallmarks of representation learning in the increasingly popular suite of foundation models including large language models such as BERT and GPT-3, but it has not been pursued in the context of multivariate…

Machine Learning · Computer Science 2024-02-05 Xiao Shou, Dharmashankar Subramanian, Debarun Bhattacharjya, Tian Gao, Kristin P. Bennett

Foundation models have recently gained attention within the field of machine learning thanks to its efficiency in broad data processing. While researchers had attempted to extend this success to time series models, the main challenge is…

Machine Learning · Computer Science 2023-11-22 Trang H. Tran, Lam M. Nguyen, Kyongmin Yeo, Nam Nguyen, Roman Vaculin

Word alignment, which aims to align translationally equivalent words between source and target sentences, plays an important role in many natural language processing tasks. Current unsupervised neural alignment methods focus on inducing…

Computation and Language · Computer Science 2021-05-18 Chi Chen, Maosong Sun, Yang Liu

This paper proposes a pre-trained neural network for handling event camera data. Our model is a self-supervised learning framework, and uses paired event camera data and natural RGB images for training. Our method contains three modules…

Computer Vision and Pattern Recognition · Computer Science 2023-07-21 Yan Yang, Liyuan Pan, Liu Liu

In this paper, we investigate self-supervised pre-training methods for document text recognition. Nowadays, large unlabeled datasets can be collected for many research tasks, including text recognition, but it is costly to annotate them.…

Computer Vision and Pattern Recognition · Computer Science 2024-05-02 Martin Kišš, Michal Hradiš

Large language models (LMs) are currently trained to predict tokens given document prefixes, enabling them to directly perform long-form generation and prompting-style tasks which can be reduced to document completion. Existing pretraining…

Self-supervised learning has drawn attention through its effectiveness in learning in-domain representations with no ground-truth annotations; in particular, it is shown that properly designed pretext tasks (e.g., contrastive prediction…

Computer Vision and Pattern Recognition · Computer Science 2022-01-17 Jonghwan Mun, Minchul Shin, Gunsoo Han, Sangho Lee, Seongsu Ha, Joonseok Lee, Eun-Sol Kim

Video-and-language pre-training has shown promising improvements on various downstream tasks. Most previous methods capture cross-modal interactions with a transformer-based multimodal encoder, not fully addressing the misalignment between…

Computer Vision and Pattern Recognition · Computer Science 2021-12-24 Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi

Conformal prediction is a powerful distribution-free tool for uncertainty quantification, establishing valid prediction intervals with finite-sample guarantees. To produce valid intervals which are also adaptive to the difficulty of each…

Machine Learning · Computer Science 2023-02-24 Nabeel Seedat, Alan Jeffares, Fergus Imrie, Mihaela van der Schaar

Through solving pretext tasks, self-supervised learning leverages unlabeled data to extract useful latent representations replacing traditional input features in the downstream task. In audio/speech signal processing, a wide range of…

Audio and Speech Processing · Electrical Eng. & Systems 2022-11-23 Salah Zaiem, Titouan Parcollet, Slim Essid, Abdel Heba

This paper introduces a self-supervised learning framework designed for pre-training neural networks tailored to dense prediction tasks using event camera data. Our approach utilizes solely event data for training. Transferring achievements…

Computer Vision and Pattern Recognition · Computer Science 2024-09-24 Yan Yang, Liyuan Pan, Liu Liu

Building an intelligent dialogue system with the ability to select a proper response according to a multi-turn context is a great challenging task. Existing studies focus on building a context-response matching model with various neural…

Computation and Language · Computer Science 2020-09-15 Ruijian Xu, Chongyang Tao, Daxin Jiang, Xueliang Zhao, Dongyan Zhao, Rui Yan

Deep neural networks for time series must capture complex temporal patterns, to effectively represent dynamic data. Self- and semi-supervised learning methods show promising results in pre-training large models, which -- when finetuned for…

Machine Learning · Computer Science 2025-08-15 Yuhan Xie, William Cappelletti, Mahsa Shoaran, Pascal Frossard

Event mentions in text correspond to real-world events of varying degrees of granularity. The task of subevent detection aims to resolve this granularity issue, recognizing the membership of multi-granular events in event complexes. Since…

Computation and Language · Computer Science 2021-09-15 Haoyu Wang, Hongming Zhang, Muhao Chen, Dan Roth

Self-supervised learning has emerged as a powerful tool for pretraining deep networks on unlabeled data, prior to transfer learning of target tasks with limited annotation. The relevance between the pretraining pretext and target tasks is…

Computer Vision and Pattern Recognition · Computer Science 2024-04-09 Tianwei Zhang, Dong Wei, Mengmeng Zhu, Shi Gu, Yefeng Zheng

Dense video captioning aims to generate corresponding text descriptions for a series of events in the untrimmed video, which can be divided into two sub-tasks, event detection and event captioning. Unlike previous works that tackle the two…

Computer Vision and Pattern Recognition · Computer Science 2023-07-24 Qi Zhang, Yuqing Song, Qin Jin

This doctoral thesis improves the transfer learning for sequence labeling tasks by adapting pre-trained neural language models. The proposed improvements in transfer learning involve introducing a multi-task model that incorporates an…

Computation and Language · Computer Science 2025-10-24 David Dukić

Video-Language Pre-training models have recently significantly improved various multi-modal downstream tasks. Previous dominant works mainly adopt contrastive learning to achieve global feature alignment across modalities. However, the…

Computer Vision and Pattern Recognition · Computer Science 2023-01-19 Fan Ma, Xiaojie Jin, Heng Wang, Jingjia Huang, Linchao Zhu, Jiashi Feng, Yi Yang