机器人学
Vision-Language-Action (VLA) models have advanced robotic manipulation by combining vision, language, and proprioception to predict actions. However, previous methods fuse proprioceptive signals directly with vision-language features,…
Dexterous grasp synthesis must jointly satisfy functional intent and physical feasibility, yet existing pipelines often decouple semantic grounding from refinement, yielding unstable or non-functional contacts under object and pose…
Visual-language reasoning, driving knowledge, and value alignment are essential for advanced autonomous driving systems. However, existing approaches largely rely on data-driven learning, making it difficult to capture the complex logic…
When tasking robots in partially observable environments, these robots must efficiently and robustly plan to achieve task goals under uncertainty. Although many probabilistic planning algorithms exist for this purpose, these algorithms can…
Temporal awareness plays a central role in intelligent behavior by shaping how actions are paced, coordinated, and adapted to changing goals and environments. In contrast, most robot learning algorithms treat time only as a fixed episode…
Pre-trained robot policies serve as the foundation of many validated robotic systems, which encapsulate extensive embodied knowledge. However, they often lack the semantic awareness characteristic of foundation models, and replacing them…
Underwater multi-robot cooperative coverage remains challenging due to partial observability, limited communication, environmental uncertainty, and the lack of access to global localization. To address these issues, this paper presents a…
Real-world robotic systems frequently require diverse end-effectors for different tasks, however most existing grasp detection methods are optimized for a single gripper type, demanding retraining or optimization for each novel gripper…
Comprehensive visual, geometric, and semantic understanding of a 3D scene is crucial for successful execution of robotic tasks, especially in unstructured and complex environments. Additionally, to make robust decisions, it is necessary for…
Implicit representations have been widely applied in robotics for obstacle avoidance and path planning. In this paper, we explore the problem of constructing an implicit distance representation from a single image. Past methods for implicit…
Accurate knowledge of the tire-road friction coefficient (TRFC) is essential for vehicle safety, stability, and performance, especially in autonomous racing, where vehicles often operate at the friction limit. However, TRFC cannot be…
Deploying visual reinforcement learning (RL) policies in real-world manipulation is often hindered by camera viewpoint changes. A policy trained from a fixed front-facing camera may fail when the camera is shifted -- an unavoidable…
Wheel-legged robots combine the advantages of both wheeled robots and legged robots, offering versatile locomotion capabilities with excellent stability on challenging terrains and high efficiency on flat surfaces. However, existing…
Soft continuum arms (SCAs) soft and deformable nature presents challenges in modeling and control due to their infinite degrees of freedom and non-linear behavior. This work introduces a reinforcement learning (RL)-based framework for…
We present Generative Predictive Control (GPC), an inference-time method for improving pretrained behavior-cloning policies without retraining. GPC augments a frozen diffusion policy at deployment with an action-conditioned world model…
Diffusion policies have shown to be very efficient at learning complex, multi-modal behaviors for robotic manipulation. However, errors in generated action sequences can compound over time which can potentially lead to failure. Some…
The increasing demand for accelerated scientific discovery, driven by global challenges, highlights the need for advanced AI-driven robotics. Deploying robotic chemists in human-centric labs is key for the next horizon of autonomous…
Deep Reinforcement learning (DRL) has achieved remarkable success in domains with well-defined reward structures, such as Atari games and locomotion. In contrast, dexterous manipulation lacks general-purpose reward formulations and…
Separating thin, flexible layers that must be individually grasped is a common but challenging manipulation primitive for most off-the-shelf grippers. A prominent example arises in clinical settings: the opening of sterile flat pouches for…
Recent advancements in integrating tactile sensing into vision-language-action (VLA) models have demonstrated transformative potential for robotic perception. However, existing tactile representations predominantly rely on qualitative…