Related papers: Self-Consistent Model-based Adaptation for Visual …

Salience-Invariant Consistent Policy Learning for Generalization in Visual Reinforcement Learning

Generalizing policies to unseen scenarios remains a critical challenge in visual reinforcement learning, where agents often overfit to the specific visual observations of the training environment. In unseen environments, distracting pixels…

Artificial Intelligence · Computer Science 2025-02-25 Jingbo Sun, Songjun Tu, Qichao Zhang, Ke Chen, Dongbin Zhao

Self-Correcting VLA: Online Action Refinement via Sparse World Imagination

Standard vision-language-action (VLA) models rely on fitting statistical data priors, limiting their robust understanding of underlying physical dynamics. Reinforcement learning enhances physical grounding through exploration yet typically…

Robotics · Computer Science 2026-02-26 Chenyv Liu, Wentao Tan, Lei Zhu, Fengling Li, Jingjing Li, Guoli Yang, Heng Tao Shen

Self-Supervised Correspondence in Visuomotor Policy Learning

In this paper we explore using self-supervised correspondence for improving the generalization performance and sample efficiency of visuomotor policy learning. Prior work has primarily used approaches such as autoencoding, pose-based…

Robotics · Computer Science 2019-09-17 Peter Florence, Lucas Manuelli, Russ Tedrake

Learning Image Deraining Transformer Network with Dynamic Dual Self-Attention

Recently, Transformer-based architecture has been introduced into single image deraining task due to its advantage in modeling non-local information. However, existing approaches tend to integrate global features based on a dense…

Computer Vision and Pattern Recognition · Computer Science 2023-08-16 Zhentao Fan, Hongming Chen, Yufeng Li

Symmetry-Guided Memory Augmentation for Efficient Locomotion Learning

Training reinforcement learning (RL) policies for legged locomotion often requires extensive environment interactions, which are costly and time-consuming. We propose Symmetry-Guided Memory Augmentation (SGMA), a framework that improves…

Machine Learning · Computer Science 2026-03-26 Kaixi Bao, Chenhao Li, Yarden As, Andreas Krause, Marco Hutter

Learning Robust and Adaptive Real-World Continuous Control Using Simulation and Transfer Learning

We use model-free reinforcement learning, extensive simulation, and transfer learning to develop a continuous control algorithm that has good zero-shot performance in a real physical environment. We train a simulated agent to act optimally…

Artificial Intelligence · Computer Science 2018-03-09 M Ferguson, K. H. Law

GTMA: Dynamic Representation Optimization for OOD Vision-Language Models

Vision-language models (VLMs) struggle in open-world applications, where out-of-distribution (OOD) concepts often trigger cross-modal alignment collapse and severely degrade zero-shot performance. We identify the root cause as modal…

Computer Vision and Pattern Recognition · Computer Science 2025-12-23 Jensen Zhang, Ningyuan Liu, Keze Wang

Generalization Across Observation Shifts in Reinforcement Learning

Learning policies that are robust to changes in the environment is critical for real-world deployment of reinforcement learning agents. Such policies are also necessary for achieving good generalization across environment shifts. We focus on…

Machine Learning · Computer Science 2023-06-08 Anuj Mahajan, Amy Zhang

A Focused Dynamic Attention Model for Visual Question Answering

Visual Question and Answering (VQA) problems are attracting increasing interest from multiple research disciplines. Solving VQA problems requires techniques from both computer vision for understanding the visual contents of a presented…

Computer Vision and Pattern Recognition · Computer Science 2016-04-07 Ilija Ilievski, Shuicheng Yan, Jiashi Feng

Learning To Simulate

Simulation is a useful tool in situations where training data for machine learning models is costly to annotate or even hard to acquire. In this work, we propose a reinforcement learning-based method for automatically adjusting the…

Machine Learning · Computer Science 2019-05-15 Nataniel Ruiz, Samuel Schulter, Manmohan Chandraker

MoDA: Map style transfer for self-supervised Domain Adaptation of embodied agents

We propose a domain adaptation method, MoDA, which adapts a pretrained embodied agent to a new, noisy environment without ground-truth supervision. Map-based memory provides important contextual information for visual navigation, and…

Robotics · Computer Science 2022-11-30 Eun Sun Lee, Junho Kim, SangWon Park, Young Min Kim

Sequential Action-Induced Invariant Representation for Reinforcement Learning

How to accurately learn task-relevant state representations from high-dimensional observations with visual distractions is a realistic and challenging problem in visual reinforcement learning. Recently, unsupervised representation learning…

Machine Learning · Computer Science 2023-09-25 Dayang Liang, Qihang Chen, Yunlong Liu

Improving generalization of robot locomotion policies via Sharpness-Aware Reinforcement Learning

Reinforcement learning often requires extensive training data. Simulation-to-real transfer offers a promising approach to address this challenge in robotics. While differentiable simulators offer improved sample efficiency through exact…

Robotics · Computer Science 2024-12-02 Severin Bochem, Eduardo Gonzalez-Sanchez, Yves Bicker, Gabriele Fadini

SC-ML: Self-supervised Counterfactual Metric Learning for Debiased Visual Question Answering

Visual question answering (VQA) is a critical multimodal task in which an agent must answer questions according to the visual cue. Unfortunately, language bias is a common problem in VQA, which refers to the model generating answers only by…

Computer Vision and Pattern Recognition · Computer Science 2023-04-05 Xinyao Shu, Shiyang Yan, Xu Yang, Ziheng Wu, Zhongfeng Chen, Zhenyu Lu

Self-supervised Visual Reinforcement Learning with Object-centric Representations

Autonomous agents need large repertoires of skills to act reasonably on new tasks that they have not seen before. However, acquiring these skills using only a stream of high-dimensional, unstructured, and unlabeled observations is a tricky…

Machine Learning · Computer Science 2021-02-09 Andrii Zadaianchuk, Maximilian Seitzer, Georg Martius

DMC-VB: A Benchmark for Representation Learning for Control with Visual Distractors

Learning from previously collected data via behavioral cloning or offline reinforcement learning (RL) is a powerful recipe for scaling generalist agents by avoiding the need for expensive online learning. Despite strong generalization in…

Machine Learning · Computer Science 2024-09-30 Joseph Ortiz, Antoine Dedieu, Wolfgang Lehrach, Swaroop Guntupalli, Carter Wendelken, Ahmad Humayun, Guangyao Zhou, Sivaramakrishnan Swaminathan, Miguel Lázaro-Gredilla, Kevin Murphy

Universal Semi-supervised Model Adaptation via Collaborative Consistency Training

In this paper, we introduce a realistic and challenging domain adaptation problem called Universal Semi-supervised Model Adaptation (USMA), which i) requires only a pre-trained source model, ii) allows the source and target domain to have…

Computer Vision and Pattern Recognition · Computer Science 2023-11-06 Zizheng Yan, Yushuang Wu, Yipeng Qin, Xiaoguang Han, Shuguang Cui, Guanbin Li

DefMamba: Deformable Visual State Space Model

Recently, state space models (SSM), particularly Mamba, have attracted significant attention from scholars due to their ability to effectively balance computational efficiency and performance. However, most existing visual Mamba methods…

Computer Vision and Pattern Recognition · Computer Science 2025-04-09 Leiye Liu, Miao Zhang, Jihao Yin, Tingwei Liu, Wei Ji, Yongri Piao, Huchuan Lu

Object-Centric World Models for Causality-Aware Reinforcement Learning

World models have been developed to support sample-efficient deep reinforcement learning agents. However, it remains challenging for world models to accurately replicate environments that are high-dimensional, non-stationary, and composed…

Machine Learning · Computer Science 2026-03-31 Yosuke Nishimoto, Takashi Matsubara

Fairness-aware Vision Transformer via Debiased Self-Attention

Vision Transformer (ViT) has recently gained significant attention in solving computer vision (CV) problems due to its capability of extracting informative features and modeling long-range dependencies through the attention mechanism.…

Computer Vision and Pattern Recognition · Computer Science 2024-07-12 Yao Qiang, Chengyin Li, Prashant Khanduri, Dongxiao Zhu