Related papers: Value Explicit Pretraining for Learning Transferab…

200 papers

While Reinforcement Learning (RL) agents can successfully learn to handle complex tasks, effectively generalizing acquired skills to unfamiliar settings remains a challenge. One reason is that the visual encoders used are…

Computer Vision and Pattern Recognition · Computer Science 2025-02-11 Yuhan Zhang, Guoqing Ma, Guangfu Hao, Liangxuan Guo, Yang Chen, Shan Yu

Reward and representation learning are two long-standing challenges for learning an expanding set of robot manipulation skills from sensory observations. Given the inherent cost and scarcity of in-domain, task-specific robot data, learning…

Robotics · Computer Science 2023-03-08 Yecheng Jason Ma, Shagun Sodhani, Dinesh Jayaraman, Osbert Bastani, Vikash Kumar, Amy Zhang

The learning of Transformation-Equivariant Representations (TERs), introduced by Hinton et al., has been considered a principle for revealing visual structures under various transformations. It contains…

Computer Vision and Pattern Recognition · Computer Science 2019-07-24 Guo-Jun Qi, Liheng Zhang, Chang Wen Chen, Qi Tian

The Vision Transformer architecture has been shown to be competitive in the computer vision (CV) space, where it has dethroned convolution-based networks in several benchmarks. Nevertheless, convolutional neural networks (CNNs) remain the…

Machine Learning · Computer Science 2023-07-20 Manuel Goulão, Arlindo L. Oliveira

Transformers have gained increasing popularity in a wide range of applications, including Natural Language Processing (NLP), Computer Vision and Speech Recognition, because of their powerful representational capacity. However, harnessing…

Tremendous progress has been made in visual representation learning, notably with the recent success of self-supervised contrastive learning methods. Supervised contrastive learning has also been shown to outperform its cross-entropy…

Computer Vision and Pattern Recognition · Computer Science 2021-08-17 Ashraful Islam, Chun-Fu Chen, Rameswar Panda, Leonid Karlinsky, Richard Radke, Rogerio Feris

Visual navigation is the task of training an embodied agent to navigate intelligently to a target object (e.g., a television) using only visual observations. A key challenge for current deep reinforcement learning models lies in the…

Computer Vision and Pattern Recognition · Computer Science 2020-04-07 Juncheng Li, Xin Wang, Siliang Tang, Haizhou Shi, Fei Wu, Yueting Zhuang, William Yang Wang

A mainstream type of current self-supervised learning methods pursues a general-purpose representation that can be well transferred to downstream tasks, typically by optimizing on a given pretext task such as instance discrimination. In…

Computer Vision and Pattern Recognition · Computer Science 2022-10-21 Xin Liu, Zhongdao Wang, Yali Li, Shengjin Wang

Prompt learning has achieved great success in efficiently exploiting large-scale pre-trained models in natural language processing (NLP). It reformulates the downstream tasks as the generative pre-training ones to achieve consistency, thus…

Computer Vision and Pattern Recognition · Computer Science 2023-12-18 Ning Liao, Bowen Shi, Xiaopeng Zhang, Min Cao, Junchi Yan, Qi Tian

In this paper, we investigate self-supervised pre-training methods for document text recognition. Nowadays, large unlabeled datasets can be collected for many research tasks, including text recognition, but it is costly to annotate them.…

Computer Vision and Pattern Recognition · Computer Science 2024-05-02 Martin Kišš, Michal Hradiš

We develop an approach to learning visual representations that embraces multimodal data, driven by a combination of intra- and inter-modal similarity preservation objectives. Unlike existing visual pre-training methods, which solve a proxy…

Computer Vision and Pattern Recognition · Computer Science 2021-04-28 Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta

Pre-training for Reinforcement Learning (RL) with purely video data is a valuable yet challenging problem. Although in-the-wild videos are readily available and carry a vast amount of prior world knowledge, the absence of action…

Computer Vision and Pattern Recognition · Computer Science 2024-11-06 Hao Luo, Bohan Zhou, Zongqing Lu

While visual imitation learning offers one of the most effective ways of learning from visual demonstrations, generalizing from them requires either hundreds of diverse demonstrations, task-specific priors, or large, hard-to-train…

Robotics · Computer Science 2021-12-07 Jyothish Pari, Nur Muhammad Shafiullah, Sridhar Pandian Arunachalam, Lerrel Pinto

Specifying reward signals that allow agents to learn complex behaviors is a long-standing challenge in reinforcement learning. A promising approach is to extract preferences for behaviors from unlabeled videos, which are widely available on…

Recent research on unsupervised person re-identification (reID) has demonstrated that pre-training on unlabeled person images achieves better performance on downstream reID tasks than pre-training on ImageNet. However, those…

Computer Vision and Pattern Recognition · Computer Science 2023-04-13 Liping Bao, Longhui Wei, Xiaoyu Qiu, Wengang Zhou, Houqiang Li, Qi Tian

Procedural video representation learning is an active research area where the objective is to learn an agent which can anticipate and forecast the future given the present video input, typically in conjunction with textual annotations.…

Computer Vision and Pattern Recognition · Computer Science 2024-10-07 Han Lin, Tushar Nagarajan, Nicolas Ballas, Mido Assran, Mojtaba Komeili, Mohit Bansal, Koustuv Sinha

Visual model-based RL methods typically encode image observations into low-dimensional representations in a manner that does not eliminate redundant information. This leaves them susceptible to spurious variations -- changes in…

Machine Learning · Computer Science 2023-10-26 Chuning Zhu, Max Simchowitz, Siri Gadipudi, Abhishek Gupta

Deep reinforcement learning (RL) algorithms suffer severe performance degradation when the interaction data is scarce, which limits their real-world application. Recently, visual representation learning has been shown to be effective and…

Machine Learning · Computer Science 2022-08-17 Yang Yue, Bingyi Kang, Zhongwen Xu, Gao Huang, Shuicheng Yan

In reinforcement learning (RL), value-based algorithms learn to associate each observation with the states and rewards that are likely to be reached from it. We observe that many self-supervised image pre-training methods bear similarity to…

Machine Learning · Computer Science 2025-06-16 Dibya Ghosh, Sergey Levine

Learning to navigate in a visual environment following natural-language instructions is a challenging task, because the multimodal inputs to the agent are highly variable, and the training data on a new task is often limited. In this paper,…

Computer Vision and Pattern Recognition · Computer Science 2020-04-07 Weituo Hao, Chunyuan Li, Xiujun Li, Lawrence Carin, Jianfeng Gao