Robotics - Scifaro

HeiSD: Hybrid Speculative Decoding for Embodied Vision-Language-Action Models with Kinematic Awareness

Vision-Language-Action (VLA) models have become the mainstream solution for robot control but suffer from slow inference. Speculative Decoding (SD) is a promising acceleration method that can be divided into two categories:…

Robotics · Computer Science 2026-04-28 Zihao Zheng, Zhihao Mao, Sicheng Tian, Maoliang Li, Jiayu Chen, Xinhao Sun, Zhaobo Zhang, Xuanzhe Liu, Donggang Cao, Hong Mei, Xiang Chen

Safety-aware Goal-oriented Semantic Sensing, Communication, and Control for Robotics

Wirelessly-connected robotic systems empower robots with real-time intelligence by leveraging remote computing resources for decision-making. However, the data exchange between robots and edge servers often overwhelms communication links,…

Robotics · Computer Science 2026-04-28 Wenchao Wu, Shutong Chen, Wenjie Liu, Zhibo Pang, Yansha Deng, Robert Schober

Robust Cooperative Localization in Featureless Environments: A Comparative Study of DCL, StCL, CCL, CI, and Standard-CL

Cooperative localization (CL) enables accurate position estimation in multi-robot systems operating in GPS-denied environments. This paper presents a comparative study of five CL approaches: Centralized Cooperative Localization (CCL),…

Robotics · Computer Science 2026-04-28 Nivand Khosravi, Rodrigo Ventura, Meysam Basiri

KERV: Kinematic-Rectified Speculative Decoding for Embodied VLA Models

Vision-Language-Action (VLA) models build a token-domain robot control paradigm, yet suffer from low speed. Speculative Decoding (SD) is an optimization strategy that can boost inference speed. Two key issues emerge when integrating VLA and…

Robotics · Computer Science 2026-04-28 Zihao Zheng, Zhihao Mao, Maoliang Li, Jiayu Chen, Xinhao Sun, Zhaobo Zhang, Donggang Cao, Hong Mei, Xiang Chen

Muscle Coactivation in the Sky: Geometry and Pareto Optimality of Energy vs. Aerodynamic Promptness and Multirotors as Variable Stiffness Actuators

In robotics and biomechanics, trading metabolic cost for kinematic readiness is a well-established principle. This paper formalizes this concept for aerial multirotors through the introduction of aerodynamic promptness -- a dynamic metric…

Robotics · Computer Science 2026-04-28 Antonio Franchi

INHerit-SG: Incremental Hierarchical Semantic Scene Graphs with RAG-Style Retrieval

Driven by recent advancements in foundation models, semantic scene graphs have emerged as a promising paradigm for high-level 3D environmental abstraction in robot navigation. However, existing frameworks struggle to successfully handle…

Robotics · Computer Science 2026-04-28 YukTungSamuel Fang, Zhikang Shi, Jiabin Qiu, Zixuan Chen, Jieqi Shi, Hao Xu, Jing Huo, Yang Gao

DextER: Language-driven Dexterous Grasp Generation with Embodied Reasoning

Language-driven dexterous grasp generation requires the models to understand task semantics, 3D geometry, and complex hand-object interactions. While vision-language models have been applied to this problem, existing approaches directly map…

Robotics · Computer Science 2026-04-28 Junha Lee, Eunha Park, Minsu Cho

NanoCockpit: Performance-optimized Application Framework for AI-based Autonomous Nanorobotics

Autonomous nano-drones, powered by vision-based tiny machine learning (TinyML) models, are a novel technology gaining momentum thanks to their broad applicability and pushing scientific advancement on resource-limited embedded systems.…

Robotics · Computer Science 2026-04-28 Elia Cereda, Alessandro Giusti, Daniele Palossi

Neuro-Symbolic Control with Large Language Models for Language-Guided Spatial Tasks

Although large language models (LLMs) have recently become effective tools for language-conditioned control in embodied systems, instability, slow convergence, and hallucinated actions continue to limit their direct application to…

Robotics · Computer Science 2026-04-28 Momina Liaqat Ali, Muhammad Abid, Muhammad Saqlain, Jose M. Merigo

One-Shot Real-World Demonstration Synthesis for Scalable Bimanual Manipulation

Learning dexterous bimanual manipulation policies critically depends on large-scale, high-quality demonstrations, yet current paradigms face inherent trade-offs: teleoperation provides physically grounded data but is prohibitively…

Robotics · Computer Science 2026-04-28 Huayi Zhou, Kui Jia

ESPADA: Execution Speedup via Semantics Aware Demonstration Data Downsampling for Imitation Learning

Behavior-cloning based visuomotor policies enable precise manipulation but often inherit the slow, cautious tempo of human demonstrations, limiting practical deployment. However, prior studies on acceleration methods mainly rely on…

Robotics · Computer Science 2026-04-28 Byung-ju Kim, Jinu Pahk, Chungwoo Lee, Jaejoon Kim, Jangha Lee, Theo Taeyeong Kim, Kyuhwan Shim, Jun Ki Lee, Byoung-Tak Zhang

SPEAR-1: Scaling Beyond Robot Demonstrations via 3D Understanding

Robotic Foundation Models (RFMs) hold great promise as generalist, end-to-end systems for robot control. Yet their ability to generalize across new environments, tasks, and embodiments remains limited. We argue that a major bottleneck lies…

Robotics · Computer Science 2026-04-28 Nikolay Nikolov, Giuliano Albanese, Sombit Dey, Aleksandar Yanev, Luc Van Gool, Jan-Nico Zaech, Danda Pani Paudel

EL3DD: Extended Latent 3D Diffusion for Language Conditioned Multitask Manipulation

Acting in human environments is a crucial capability for general-purpose robots, necessitating a robust understanding of natural language and its application to physical tasks. This paper seeks to harness the capabilities of diffusion…

Robotics · Computer Science 2026-04-28 Jonas Bode, Raphael Memmesheimer, Sven Behnke

Humanoid Whole-Body Badminton via Multi-Stage Reinforcement Learning

Humanoid robots have demonstrated strong capabilities for interacting with static scenes across locomotion and manipulation, yet dynamic real-world interactions remain challenging. As a step toward fast-moving object interactions, we…

Robotics · Computer Science 2026-04-28 Chenhao Liu, Leyun Jiang, Yibo Wang, Kairan Yao, Jinchen Fu, Xiaoyu Ren

Using Language Models as Closed-Loop High-Level Planners for Robotics Applications: A Brief Overview and Benchmarks

Large Language Models (LLMs) and Vision Language Models (VLMs) have become popular tools for embodied high-level planning. However, their deployment in black-box settings often leads to unpredictable or costly errors. To harness their…

Robotics · Computer Science 2026-04-28 Hao Wang, Sathwik Karnik, Bea Lim, Somil Bansal

SARM: Stage-Aware Reward Modeling for Long Horizon Robot Manipulation

Large-scale robot learning has made progress on complex manipulation tasks, yet long-horizon, contact-rich problems, especially those involving deformable objects, remain challenging due to inconsistent demonstration quality. We propose a…

Robotics · Computer Science 2026-04-28 Qianzhong Chen, Justin Yu, Mac Schwager, Pieter Abbeel, Yide Shentu, Philipp Wu

World-Env: Leveraging World Model as a Virtual Environment for VLA Post-Training

Vision-Language-Action (VLA) models trained via imitation learning suffer from significant performance degradation in data-scarce scenarios due to their reliance on large-scale demonstration datasets. Although reinforcement learning…

Robotics · Computer Science 2026-04-28 Junjin Xiao, Yandan Yang, Xinyuan Chang, Ronghan Chen, Feng Xiong, Mu Xu, Wei-Shi Zheng, Qing Zhang

Aegis: Automated Error Generation and Attribution for Multi-Agent Systems

Large language model based multi-agent systems (MAS) have unlocked significant advancements in tackling complex problems, but their increasing capability introduces a structural fragility that makes them difficult to debug. A key obstacle…

Robotics · Computer Science 2026-04-28 Fanqi Kong, Ruijie Zhang, Huaxiao Yin, Guibin Zhang, Xiaofei Zhang, Ziang Chen, Zhaowei Zhang, Xiaoyuan Zhang, Song-Chun Zhu, Xue Feng

Developing a Robotic Surgery Training System for Wide Accessibility and Research

Robotic surgery represents a major breakthrough in medical interventions, which has revolutionized surgical procedures. However, the high cost and limited accessibility of robotic surgery systems pose significant challenges for training…

Robotics · Computer Science 2026-04-28 Walid Shaker, Mustafa Suphi Erden

DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment

This paper presents DriVerse, a generative model for simulating navigation-driven driving scenes from a single image and a future trajectory. Previous autonomous driving world models either directly feed the trajectory or discrete control…

Robotics · Computer Science 2026-04-28 Xiaofan Li, Chenming Wu, Zhao Yang, Zhihao Xu, Dingkang Liang, Yumeng Zhang, Ji Wan, Jun Wang