Computer Science

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

Robot manipulation critically depends on perception that preserves the action-relevant aspects of a scene. Yet most robot learning pipelines are built upon visual encoders pre-trained for static recognition or vision-language alignment,…

Robotics · Computer Science 2026-05-29 Jusuk Lee, Seungjae Lee, Jonghun Shin, Hoseong Jung, Sungha Kim, Daesol Cho, H. Jin Kim, Jia-Bin Huang, Furong Huang

RoboWits: Unexpected Challenges for Robotic Creative Problem Solving

The ability to reason, adapt, and creatively solve problems under unexpected challenges is essential for robots operating in real-world environments. However, current robotic benchmarks primarily emphasize skill-level execution and provide…

Robotics · Computer Science 2026-05-29 Chunru Lin, Hongxin Zhang, Fenghao Yu, Zhehuan Chen, Thomas L. Griffiths, Yejin Choi, David Held, Chuang Gan

A Heterogeneous Architecture for Robot RL Beyond GPU-Dominant Paradigms

Simulation-based RL for contemporary robot control is increasingly organized around GPU-resident simulation: physics, rollout collection, and learning are placed on a single GPU-centric execution path. This paradigm has greatly improved…

Robotics · Computer Science 2026-05-29 Yufei Jia, Zhanxiang Cao, Mingrui Yu, Heng Zhang, Shenyu Chen, Dixuan Jiang, Meng Li, Xiaofan Li, Yiyang Liu, Junzhe Wu, Zheng Li, XiLin Fang, Tingyu Cui, Shengcheng Fu, Haoyang Li, Anqi Wang, Zifan Wang, Dongjie Zhu, Chenyu Cao, Zhenbiao Huang, Ziang Zheng, Jie Lu, Xin Ma, Zhengyang Wei, Xiang Zhao, Tianyue Zhan, Ye He, Yuxiang Chen, Yizhou Jiang, Yue Li, Haizhou Ge, Yuhang Dong, Fan Jia, Ziheng Zhang, Meng Zhang, Xiwa Deng, Zhixing Chen, Hanyang Shao, Chenxin Dong, Yixuan Li, Yizhi Chen, Bokui Chen, Kaifeng Zhang, Hanqing Cui, Yusen Qin, Ruqi Huang, Lei Han, Tiancai Wang, Xiang Li, Yue Gao, Guyue Zhou

RAFI -- A Ray/Work Forwarding Infrastructure for Data Parallel Multi-Node/Multi-GPU Computing

We present RaFI, a CUDA and MPI based software framework that simplifies the task of building GPU-enabled data-parallel software where rays or similar work items need to migrate between different GPUs. RaFI provides a simple interface for…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-29 Ingo Wald, Serkan Demirci, Alper Sahistan, Stefan Zellmann, Andrea Paris, Patrick Moran, Milan Jaros, Tatiana von Landesberger, Ugur Gudukbay, Valerio Pascucci

Gaze2Act: Gaze-Conditioned Vision-Language-Action Policies for Interactive Robot Manipulation

Vision-Language-Action (VLA) models have recently shown strong potential for robot learning by following language instructions. However, in practice, language alone is often insufficient to precisely convey human intent. It is difficult to…

Robotics · Computer Science 2026-05-29 Kuangji Zuo, Gen Li, Bofan Lyu, Yanshuo Lu, Boyu Ma, Shijia Han, Xinyu Zhou, Xichen Yuan, Chuhao Zhou, Jiaqi Bai, Geng Li, Jianfei Yang

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Embodied intelligence is often studied through specialized models for individual tasks such as manipulation or navigation, resulting in fragmented capabilities and limited generalization across tasks, environments, and robot embodiments. In…

Robotics · Computer Science 2026-05-29 Qiuyue Wang, Mingsheng Li, Jian Guan, Jinhui Ye, Sicheng Xie, Yitao Liu, Junhao Chen, Zhixuan Liang, Jie Zhang, Xintong Hu, Xuhong Huang, Pei Lin, Junyang Lin, Dayiheng Liu, Shuai Bai, Jingren Zhou, Jiazhao Zhang, Haoqi Yuan, Gengze Zhou, Hang Yin, Ye Wang, Yiyang Huang, Zixing Lei, Wujian Peng, Delin Chen, Yingming Zheng, Jingyang Fan, Xianwei Zhuang, Xin Zhou, Haoyang Li, Anzhe Chen, Tong Zhang, Xuejing Liu, Yuchong Sun, Ruizhe Chen, Zhaohai Li, Chenxu Lü, Zhibo Yang, Tao Yu, Xionghui Chen

BORA: Bridging Offline Reinforcement Learning and Online Residual Adaptation for Real-World Dexterous VLA Models

Vision-Language-Action (VLA) models have emerged as a promising paradigm for grounding visual-language understanding into real-world robotic manipulation. However, dexterous manipulation remains challenging for VLA policies due to…

Robotics · Computer Science 2026-05-29 Zhongxi Chen, Yifan Han, Yanming Shao, Huanming Liu, Congsheng Xu, Xiaoyu Chen, Yao Mu, Wenzhao Lian

Unveiling the Visual Counting Bottleneck in Vision-Language Models

While Large Vision-Language Models (VLMs) excel at interpolation, they suffer catastrophic failures in systematic generalization, most notably in visual counting. In this work, we investigate this extrapolation bottleneck by deconstructing…

Multimedia · Computer Science 2026-05-29 Xingzhou Pang, Yifan Hou, Junling Wang, Mrinmaya Sachan

Sample-Efficient Diffusion-based Reinforcement Learning with Critic Guidance

Recent advances in reinforcement learning (RL) have achieved great success by leveraging the multimodality and exploration capability of diffusion policies. Among these approaches, one representative branch focuses on the sampling-based…

Robotics · Computer Science 2026-05-29 Shutong Ding, Zejia Zhong, Zhongyi Wang, Ke Hu, Bikang Pan, Jingya Wang, Ye Shi

Replicable Simulation-Based Robot Validation through Provenance

Robot behavior is often validated through simulation-based testing, yet the replicability of such campaigns depends critically on transparent documentation of how tests are configured, executed, and post-processed. We argue that data…

Robotics · Computer Science 2026-05-29 Argentina Ortega, Samuel Wiest, Frederik Pasch, Nico Hochgeschwender

Effective MPI: User-defined Datatypes and Cartesian Communicators for Zero-copy All-to-all Communication in Multidimensional Tori

We present and show how to implement a non-trivial all-to-all communication algorithm for arbitrary $d$-dimensional tori effectively in MPI. Given a factorization of the number of processes $p$ into $d$ factors that can be mapped onto a…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-29 Jesper Larsson Träff

Fisher-Preserving Guidance: Training-Free Manifold Constraints for Safe Diffusion Control

Diffusion models are effective for waypoint prediction in visual navigation, but standard sampling and test time guidance can produce unreliable or inefficient trajectories when updates drift off the training manifold. We propose Fisher…

Robotics · Computer Science 2026-05-29 Hao Ren, Zetong Bi, Yiming Zeng, Le Zheng, Zhi Li, Zhaoliang Wan, Lu Qi, Hui Cheng

LLM-Guided Future Hypotheses for Horizon-Aware Exploration in Multi-Step Robot Manipulation

Multi-step robot manipulation requires acting under uncertainty about how the scene will evolve, making exploration and policy adaptation challenging. We study whether short-horizon, task-consistent future videos can provide useful…

Robotics · Computer Science 2026-05-29 Mohammad Khoshnazar, Andrew Melnik, Michael Beetz

Joint Angle Estimation with Customized Wristband Based on Online Incremental Learning

Intelligent wearable technology plays an increasingly important role in human-computer interaction, motion, and health monitoring. To ensure comfort and practicality of use, one common form for motion monitoring is to utilize soft wearable…

Robotics · Computer Science 2026-05-29 Shuo Wang, Xiaobin Chen, Xiaoming Tao

MARS Policy: Multimodality Only When It Matters

Imitation learning has become a cornerstone for solving complex robotic manipulation tasks. In particular, multimodality, which enables robots to capture diverse yet valid behavioral patterns, has driven the rapid emergence of generative…

Robotics · Computer Science 2026-05-29 Jindou Jia, Tuo An, Yuxuan Hu, Gen Li, Jingliang Li, Bohan Hou, Xiangyu Chen, Jiaqi Bai, Bofan Lyu, Jianfei Yang

CARM Tool: Cache-Aware Roofline Model Automatic Benchmarking and Application Analysis

In recent years, HPC systems, and the CPU architectures at their core, have become increasingly complex, making application development and optimization quite challenging. In this respect, intuitive performance models like the…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-29 José Morgado, Leonel Sousa, Aleksandar Ilic

PRISM: Processing-In-Memory Sparse MTTKRP for Tensor Decomposition Acceleration

Sparse tensors are the most used representation of sparse multidimensional data. Operations that decompose them, selecting their most important features while reducing their dimension, have become prevalent procedures in machine learning.…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-29 Daniel Pacheco, Leonel Sousa, Aleksandar Ilic

PhAIL: A Real-Robot VLA Benchmark and Distributional Methodology

Real-world evaluation of vision-language-action (VLA) policies still rests on binary success rate at a fixed timeout with $N \le 25$ rollouts per condition, almost always without confidence intervals or paired statistical comparison; these…

Robotics · Computer Science 2026-05-29 Sergey Arkhangelskiy
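
To illustrate why a binary success rate from $N \le 25$ rollouts is hard to interpret without a confidence interval, the following is a minimal sketch that computes a Wilson score interval for a binomial proportion. It is not taken from the PhAIL paper: the function name `wilson_interval`, the 95% level (`z = 1.96`), and the example count of 20 successes out of 25 rollouts are all assumptions chosen for illustration.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial success rate.

    `successes` out of `n` rollouts; `z` is the normal quantile
    (1.96 corresponds to roughly 95% coverage). Illustrative helper,
    not part of any benchmark's code.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical example: 20 successes in 25 rollouts (observed rate 0.80).
low, high = wilson_interval(20, 25)
print(f"observed rate: {20 / 25:.2f}, 95% Wilson interval: [{low:.2f}, {high:.2f}]")
```

With so few rollouts the interval spans roughly thirty percentage points, so two policies whose observed success rates differ by ten points cannot be separated on this evidence alone.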

FLIP: Real-Time and Resilient Formation Planning for Large-Scale DIstributed Swarms via Point Cloud Registration

Traditional large-scale formation planning methods either oversimplify the formation representation, which leads to poor performance, or employ complete collaborative relationships, which results in excessive computational load. To achieve…

Robotics · Computer Science 2026-05-29 Yuan Zhou, Guangtong Xu, Zhenyu Hou, Jialiang Hou, Fei Gao

AMDP: Asynchronous Multi-Directional Pipeline Parallelism for Large-Scale Models Training

Pipeline parallelism is essential for large-scale model training, but existing asynchronous approaches often degrade convergence due to parameter mismatch between forward and backward passes. We propose Asynchronous Multi-Directional…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-29 Ling Chen, Houming Wu, Wenjie Yu