Max Argus — Scifaro

MolmoB0T: Large-Scale Simulation Enables Zero-Shot Manipulation

A prevailing view in robot learning is that simulation alone is not enough; effective sim-to-real transfer is widely believed to require at least some real-world data collection or task-specific fine-tuning to bridge the gap between…

Robotics · Computer Science · 2026-03-27 · Abhay Deshpande, Maya Guru, Rose Hendrix, Snehal Jauhri, Ainaz Eftekhar, Rohun Tripathi, Max Argus, Jordi Salvador, Haoquan Fang, Matthew Wallingford, Wilbert Pumacay, Yejin Kim, Quinn Pfeifer, Ying-Chun Lee, Piper Wolters, Omar Rayyan, Mingtong Zhang, Jiafei Duan, Karen Farley, Winson Han, Eli Vanderbilt, Dieter Fox, Ali Farhadi, Georgia Chalvatzaki, Dhruv Shah, Ranjay Krishna

MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation

Deploying robots at scale demands robustness to the long tail of everyday situations. The countless variations in scene layout, object geometry, and task specifications that characterize real environments are vast and underrepresented in…

Robotics · Computer Science · 2026-02-20 · Yejin Kim, Wilbert Pumacay, Omar Rayyan, Max Argus, Winson Han, Eli VanderBilt, Jordi Salvador, Abhay Deshpande, Rose Hendrix, Snehal Jauhri, Shuo Liu, Nur Muhammad Mahi Shafiullah, Maya Guru, Ainaz Eftekhar, Karen Farley, Donovan Clay, Jiafei Duan, Arjun Guru, Piper Wolters, Alvaro Herrasti, Ying-Chun Lee, Georgia Chalvatzaki, Yuchen Cui, Ali Farhadi, Dieter Fox, Ranjay Krishna

cVLA: Towards Efficient Camera-Space VLAs

Vision-Language-Action (VLA) models offer a compelling framework for tackling complex robotic manipulation tasks, but they are often expensive to train. In this paper, we propose a novel VLA approach that leverages the competitive…

Robotics · Computer Science · 2025-12-23 · Max Argus, Jelena Bratulic, Houman Masnavi, Maxim Velikanov, Nick Heppert, Abhinav Valada, Thomas Brox

Efficient Learning of Object Placement with Intra-Category Transfer

Efficient learning from demonstration for long-horizon tasks remains an open challenge in robotics. While significant effort has been directed toward learning trajectories, a recent resurgence of object-centric approaches has demonstrated…

Robotics · Computer Science · 2025-12-01 · Adrian Röfer, Russell Buchanan, Max Argus, Sethu Vijayakumar, Abhinav Valada

When and How Does CLIP Enable Domain and Compositional Generalization?

The remarkable generalization performance of contrastive vision-language models like CLIP is often attributed to the diversity of their training distributions. However, key questions remain unanswered: Can CLIP generalize to an entirely…

Machine Learning · Computer Science · 2025-09-15 · Elias Kempf, Simon Schrodi, Max Argus, Thomas Brox

Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models

Contrastive vision-language models (VLMs), like CLIP, have gained popularity for their versatile applicability to various downstream tasks. Despite their successes in some tasks, like zero-shot object recognition, they perform surprisingly…

Computer Vision and Pattern Recognition · Computer Science · 2025-04-17 · Simon Schrodi, David T. Hoffmann, Max Argus, Volker Fischer, Thomas Brox

DITTO: Demonstration Imitation by Trajectory Transformation

Teaching robots new skills quickly and conveniently is crucial for the broader adoption of robotic systems. In this work, we address the problem of one-shot imitation from a single human demonstration, given by an RGB-D video recording. We…

Robotics · Computer Science · 2025-01-30 · Nick Heppert, Max Argus, Tim Welschehold, Thomas Brox, Abhinav Valada

Learning Robotic Manipulation Policies from Point Clouds with Conditional Flow Matching

Learning from expert demonstrations is a promising approach for training robotic manipulation policies from limited data. However, imitation learning algorithms require a number of design choices ranging from the input modality, training…

Robotics · Computer Science · 2024-09-12 · Eugenio Chisari, Nick Heppert, Max Argus, Tim Welschehold, Thomas Brox, Abhinav Valada

Concept Bottleneck Models Without Predefined Concepts

There has been considerable recent interest in interpretable concept-based models such as Concept Bottleneck Models (CBMs), which first predict human-interpretable concepts and then map them to output classes. To reduce reliance on…

Machine Learning · Computer Science · 2024-07-08 · Simon Schrodi, Julian Schur, Max Argus, Thomas Brox

CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity

Sample efficiency is a crucial problem in deep reinforcement learning. Recent algorithms, such as REDQ and DroQ, found a way to improve the sample efficiency by increasing the update-to-data (UTD) ratio to 20 gradient update steps on the…

Machine Learning · Computer Science · 2024-03-26 · Aditya Bhatt, Daniel Palenicek, Boris Belousov, Max Argus, Artemij Amiranashvili, Thomas Brox, Jan Peters

Latent Diffusion Counterfactual Explanations

Counterfactual explanations have emerged as a promising method for elucidating the behavior of opaque black-box models. Recently, several works leveraged pixel-space diffusion models for counterfactual generation. To handle noisy,…

Machine Learning · Computer Science · 2023-10-11 · Karim Farid, Simon Schrodi, Max Argus, Thomas Brox

Climate-sensitive Urban Planning through Optimization of Tree Placements

Climate change is increasing the intensity and frequency of many extreme weather events, including heatwaves, which results in increased thermal discomfort and mortality rates. While global mitigation action is undoubtedly necessary, so is…

Computer Vision and Pattern Recognition · Computer Science · 2023-10-10 · Simon Schrodi, Ferdinand Briegel, Max Argus, Andreas Christen, Thomas Brox

Compositional Servoing by Recombining Demonstrations

Learning-based manipulation policies from image inputs often show weak task transfer capabilities. In contrast, visual servoing methods allow efficient task transfer in high-precision scenarios while requiring only a few demonstrations. In…

Robotics · Computer Science · 2023-10-09 · Max Argus, Abhijeet Nayak, Martin Büchner, Silvio Galesso, Abhinav Valada, Thomas Brox

Far Away in the Deep Space: Dense Nearest-Neighbor-Based Out-of-Distribution Detection

The key to out-of-distribution detection is density estimation of the in-distribution data or of its feature representations. This is particularly challenging for dense anomaly detection in domains where the in-distribution data has a…

Computer Vision and Pattern Recognition · Computer Science · 2023-09-15 · Silvio Galesso, Max Argus, Thomas Brox

RobotIO: A Python Library for Robot Manipulation Experiments

Setting up robot environments to quickly test newly developed algorithms is still a difficult and time-consuming process. This presents a significant hurdle to researchers interested in performing real-world robotic experiments. RobotIO is…

Robotics · Computer Science · 2022-08-17 · Lukas Hermann, Max Argus, Adrian Roefer, Abhinav Valada, Thomas Brox

Conditional Visual Servoing for Multi-Step Tasks

Visual Servoing has been effectively used to move a robot into specific target locations or to track a recorded demonstration. It does not require manual programming, but it is typically limited to settings where one demonstration maps to…

Robotics · Computer Science · 2022-05-18 · Sergio Izquierdo, Max Argus, Thomas Brox

Contrastive Representation Learning for Hand Shape Estimation

This work presents improvements in monocular hand shape estimation by building on top of recent advances in unsupervised learning. We extend momentum contrastive learning and contribute a structured collection of hand images, well suited…

Computer Vision and Pattern Recognition · Computer Science · 2021-07-05 · Christian Zimmermann, Max Argus, Thomas Brox

Pre-training of Deep RL Agents for Improved Learning under Domain Randomization

Visual domain randomization in simulated environments is a widely used method to transfer policies trained in simulation to real robots. However, domain randomization and augmentation hamper the training of a policy. As reinforcement…

Machine Learning · Computer Science · 2021-04-30 · Artemij Amiranashvili, Max Argus, Lukas Hermann, Wolfram Burgard, Thomas Brox

Temporal Shift GAN for Large Scale Video Generation

Video generation models have become increasingly popular in the last few years; however, the standard 2D architectures used today lack natural spatio-temporal modelling capabilities. In this paper, we present a network architecture for video…

Computer Vision and Pattern Recognition · Computer Science · 2020-11-12 · Andres Munoz, Mohammadreza Zolfaghari, Max Argus, Thomas Brox

Adaptive Curriculum Generation from Demonstrations for Sim-to-Real Visuomotor Control

We propose Adaptive Curriculum Generation from Demonstrations (ACGD) for reinforcement learning in the presence of sparse rewards. Rather than designing shaped reward functions, ACGD adaptively sets the appropriate task difficulty for the…

Robotics · Computer Science · 2020-07-09 · Lukas Hermann, Max Argus, Andreas Eitel, Artemij Amiranashvili, Wolfram Burgard, Thomas Brox