
Training-free Task-oriented Grasp Generation

Robotics 2025-10-07 v4

Abstract

This paper presents a training-free pipeline for task-oriented grasp generation that combines pre-trained grasp generation models with vision-language models (VLMs). Unlike traditional approaches that focus solely on stable grasps, our method incorporates task-specific requirements by leveraging the semantic reasoning capabilities of VLMs. We evaluate five querying strategies, each utilizing different visual representations of candidate grasps, and demonstrate significant improvements over a baseline method in both grasp success and task compliance rates, with absolute gains of up to 36.9% in overall success rate. Our results underline the potential of VLMs to enhance task-oriented manipulation, providing insights for future research in robotic grasping and human-robot interaction.
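
As a rough illustration of the idea described in the abstract, the sketch below picks one grasp from a set of pre-generated candidates by ranking them with a task-fit score. It is not the authors' implementation: the `Grasp` class, the `select_task_grasp` function, the stability threshold, and the scorer interface are all hypothetical, and the scorer is a toy lookup table standing in for a real VLM query.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Grasp:
    """Hypothetical candidate grasp from a pre-trained grasp generator."""
    grasp_id: int
    stability: float  # assumed stability score in [0, 1]


# Hypothetical stand-in for a VLM query: (task, grasp) -> task-fit score in [0, 1].
TaskScorer = Callable[[str, Grasp], float]


def select_task_grasp(
    task: str,
    candidates: Sequence[Grasp],
    scorer: TaskScorer,
    min_stability: float = 0.5,  # assumed threshold, not taken from the paper
) -> Grasp:
    """Return the stable candidate that the scorer rates best for the task."""
    # Keep only grasps the generator considers stable enough.
    stable = [g for g in candidates if g.stability >= min_stability]
    if not stable:
        raise ValueError("no candidate meets the stability threshold")
    # Rank by task fit first; break ties with the stability score.
    return max(stable, key=lambda g: (scorer(task, g), g.stability))


# Toy usage: a lookup table plays the role of the VLM's judgement.
candidates = [Grasp(0, 0.9), Grasp(1, 0.8), Grasp(2, 0.3)]
toy_scores = {0: 0.2, 1: 0.7, 2: 0.95}
best = select_task_grasp(
    "hand over the knife",
    candidates,
    lambda task, g: toy_scores[g.grasp_id],
)
print(best.grasp_id)
```

In this toy run, grasp 2 has the highest task-fit score but is filtered out for low stability, so grasp 1 is selected. No model is trained or fine-tuned at any step, which is the sense in which such a pipeline is training-free.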

Cite

@article{arxiv.2502.04873,
  title   = {Training-free Task-oriented Grasp Generation},
  author  = {Jiaming Wang and Diwen Liu and Jizhuo Chen and Harold Soh},
  journal = {arXiv preprint arXiv:2502.04873},
  year    = {2025}
}

Comments

Jiaming Wang, Diwen Liu, and Jizhuo Chen contributed equally