Related papers: VLM-driven Behavior Tree for Context-aware Task Pl…

LLM-as-BT-Planner: Leveraging LLMs for Behavior Tree Generation in Robot Task Planning

Robotic assembly tasks remain an open challenge due to their long horizon nature and complex part relations. Behavior trees (BTs) are increasingly used in robot task planning for their modularity and flexibility, but creating them manually…

Robotics · Computer Science 2025-06-19 Jicong Ao, Fan Wu, Yansong Wu, Abdalla Swikir, Sami Haddadin

Multimodal Behavior Tree Generation: A Small Vision-Language Model for Robot Task Planning

Large and small language models have been widely used for robotic task planning. At the same time, vision-language models (VLMs) have successfully tackled problems such as image captioning, scene understanding, and visual question…

Robotics · Computer Science 2026-03-09 Cristiano Battistini, Riccardo Andrea Izzo, Gianluca Bardaro, Matteo Matteucci

Task-oriented Robotic Manipulation with Vision Language Models

Vision Language Models (VLMs) play a crucial role in robotic manipulation by enabling robots to understand and interpret the visual properties of objects and their surroundings, allowing them to perform manipulation based on this multimodal…

Robotics · Computer Science 2025-05-21 Nurhan Bulus Guran, Hanchi Ren, Jingjing Deng, Xianghua Xie

A Study on Training and Developing Large Language Models for Behavior Tree Generation

This paper presents an innovative exploration of the application potential of large language models (LLM) in addressing the challenging task of automatically generating behavior trees (BTs) for complex tasks. The conventional manual BT…

Computation and Language · Computer Science 2024-01-17 Fu Li, Xueying Wang, Bin Li, Yunlong Wu, Yanzhen Wang, Xiaodong Yi

LLM-BT: Performing Robotic Adaptive Tasks based on Large Language Models and Behavior Trees

Large Language Models (LLMs) have been widely utilized to perform complex robotic tasks. However, handling external disturbances during tasks is still an open challenge. This paper proposes a novel method to achieve robotic adaptive tasks…

Robotics · Computer Science 2024-08-20 Haotian Zhou, Yunhan Lin, Longwu Yan, Jihong Zhu, Huasong Min

LLM-HBT: Dynamic Behavior Tree Construction for Adaptive Coordination in Heterogeneous Robots

We introduce a novel framework for automatic behavior tree (BT) construction in heterogeneous multi-robot systems, designed to address the challenges of adaptability and robustness in dynamic environments. Traditional robots are limited by…

Robotics · Computer Science 2025-10-14 Chaoran Wang, Jingyuan Sun, Yanhui Zhang, Mingyu Zhang, Changju Wu

Combining Context Awareness and Planning to Learn Behavior Trees from Demonstration

Fast changing tasks in unpredictable, collaborative environments are typical for medium-small companies, where robotised applications are increasing. Thus, robot programs should be generated in short time with small effort, and the robot…

Robotics · Computer Science 2022-03-18 Oscar Gustavsson, Matteo Iovino, Jonathan Styrud, Christian Smith

Behavior Tree Generation using Large Language Models for Sequential Manipulation Planning with Human Instructions and Feedback

In this work, we propose an LLM-based BT generation framework to leverage the strengths of both for sequential manipulation planning. To enable human-robot collaborative task planning and enhance intuitive robot programming by nonexperts,…

Robotics · Computer Science 2024-09-17 Jicong Ao, Yansong Wu, Fan Wu, Sami Haddadin

Addressing Failures in Robotics using Vision-Based Language Models (VLMs) and Behavior Trees (BT)

In this paper, we propose an approach that combines Vision Language Models (VLMs) and Behavior Trees (BTs) to address failures in robotics. Current robotic systems can handle known failures with pre-existing recovery strategies, but they…

Robotics · Computer Science 2024-11-05 Faseeh Ahmad, Jonathan Styrud, Volker Krueger

Automatic Robot Task Planning by Integrating Large Language Model with Genetic Programming

Accurate task planning is critical for controlling autonomous systems, such as robots, drones, and self-driving vehicles. Behavior Trees (BTs) are considered one of the most prominent control-policy-defining frameworks in task planning, due…

Robotics · Computer Science 2025-02-12 Azizjon Kobilov, Jianglin Lan

ReplanVLM: Replanning Robotic Tasks with Visual Language Models

Large language models (LLMs) have gained increasing popularity in robotic task planning due to their exceptional abilities in text analytics and generation, as well as their broad knowledge of the world. However, they fall short in decoding…

Robotics · Computer Science 2024-08-01 Aoran Mei, Guo-Niu Zhu, Huaxiang Zhang, Zhongxue Gan

Action Contextualization: Adaptive Task Planning and Action Tuning using Large Language Models

Large Language Models (LLMs) present a promising frontier in robotic task planning by leveraging extensive human knowledge. Nevertheless, the current literature often overlooks the critical aspects of robots' adaptability and error…

Robotics · Computer Science 2024-11-27 Sthithpragya Gupta, Kunpeng Yao, Loïc Niederhauser, Aude Billard

LEMMo-Plan: LLM-Enhanced Learning from Multi-Modal Demonstration for Planning Sequential Contact-Rich Manipulation Tasks

Large Language Models (LLMs) have gained popularity in task planning for long-horizon manipulation tasks. To enhance the validity of LLM-generated plans, visual demonstrations and online videos have been widely employed to guide the…

Robotics · Computer Science 2025-03-12 Kejia Chen, Zheng Shen, Yue Zhang, Lingyun Chen, Fan Wu, Zhenshan Bing, Sami Haddadin, Alois Knoll

WorldVLM: Combining World Model Forecasting and Vision-Language Reasoning

Autonomous driving systems depend on models that can reason about high-level scene contexts and accurately predict the dynamics of their surrounding environment. Vision-Language Models (VLMs) have recently emerged as promising tools for…

Computer Vision and Pattern Recognition · Computer Science 2026-03-18 Stefan Englmeier, Katharina Winter, Fabian B. Flohr

Seeing is Believing (and Predicting): Context-Aware Multi-Human Behavior Prediction with Vision Language Models

Accurately predicting human behaviors is crucial for mobile robots operating in human-populated environments. While prior research primarily focuses on predicting actions in single-human scenarios from an egocentric view, several robotic…

Computer Vision and Pattern Recognition · Computer Science 2025-12-19 Utsav Panchal, Yuchen Liu, Luigi Palmieri, Ilche Georgievski, Marco Aiello

Vision-Language-Policy Model for Dynamic Robot Task Planning

Bridging the gap between natural language commands and autonomous execution in unstructured environments remains an open challenge for robotics. This requires robots to perceive and reason over the current task scene through multiple…

Robotics · Computer Science 2025-12-23 Jin Wang, Kim Tien Ly, Jacques Cloete, Nikos Tsagarakis, Ioannis Havoutis

Automatic Behavior Tree Expansion with LLMs for Robotic Manipulation

Robotic systems for manipulation tasks are increasingly expected to be easy to configure for new tasks or unpredictable environments, while keeping a transparent policy that is readable and verifiable by humans. We propose the method…

Robotics · Computer Science 2024-09-23 Jonathan Styrud, Matteo Iovino, Mikael Norrlöf, Mårten Björkman, Christian Smith

LLM-BRAIn: AI-driven Fast Generation of Robot Behaviour Tree based on Large Language Model

This paper presents a novel approach in autonomous robot control, named LLM-BRAIn, that makes possible robot behavior generation, based on operator's commands. LLM-BRAIn is a transformer-based Large Language Model (LLM) fine-tuned from…

Robotics · Computer Science 2023-06-01 Artem Lykov, Dzmitry Tsetserukou

VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making

Recent research looks to harness the general knowledge and reasoning of large language models (LLMs) into agents that accomplish user-specified goals in interactive environments. Vision-language models (VLMs) extend LLMs to multi-modal data…

Machine Learning · Computer Science 2025-05-07 Jake Grigsby, Yuke Zhu, Michael Ryoo, Juan Carlos Niebles

Generating Executable Action Plans with Environmentally-Aware Language Models

Large Language Models (LLMs) trained using massive text datasets have recently shown promise in generating action plans for robotic agents from high level text queries. However, these models typically do not consider the robot's…

Robotics · Computer Science 2023-05-03 Maitrey Gramopadhye, Daniel Szafir