
Related papers: Embodied Executable Policy Learning with Language-…

200 papers

Large Language Models (LLMs) trained using massive text datasets have recently shown promise in generating action plans for robotic agents from high level text queries. However, these models typically do not consider the robot's…

Robotics · Computer Science 2023-05-03 Maitrey Gramopadhye, Daniel Szafir

We show that large language models (LLMs) can be adapted to be generalizable policies for embodied visual tasks. Our approach, called Large LAnguage model Reinforcement Learning Policy (LLaRP), adapts a pre-trained frozen LLM to take as…

Automatically generating training supervision for embodied tasks is crucial, as manual designing is tedious and not scalable. While prior works use large language models (LLMs) or vision-language models (VLMs) to generate rewards, these…

Computer Vision and Pattern Recognition · Computer Science 2025-03-14 Xiaowen Qiu, Yian Wang, Jiting Cai, Zhehuan Chen, Chunru Lin, Tsun-Hsuan Wang, Chuang Gan

This thesis introduces "Embodied Spatial Intelligence" to address the challenge of creating robots that can perceive and act in the real world based on natural language instructions. To bridge the gap between Large Language Models (LLMs)…

Robotics · Computer Science 2025-09-03 Jiading Fang

Vision-Language Models (VLMs) have recently demonstrated strong capabilities in mapping multimodal observations to robot behaviors. However, most current approaches rely on end-to-end visuomotor policies that remain opaque and difficult to…

Robotics · Computer Science 2026-05-18 Alessandro Adami, Tommaso Tubaldo, Marco Todescato, Ruggero Carli, Pietro Falco

Large Language Models (LLMs) present a promising frontier in robotic task planning by leveraging extensive human knowledge. Nevertheless, the current literature often overlooks the critical aspects of robots' adaptability and error…

Robotics · Computer Science 2024-11-27 Sthithpragya Gupta, Kunpeng Yao, Loïc Niederhauser, Aude Billard

The recent development of Video-based Large Language Models (VideoLLMs) has significantly advanced video summarization by aligning video features and, in some cases, audio features with Large Language Models (LLMs). Each of these VideoLLMs…

Computer Vision and Pattern Recognition · Computer Science 2024-10-08 Kuan-Chen Mu, Zhi-Yi Chin, Wei-Chen Chiu

Explaining reinforcement learning agents is challenging because policies emerge from complex reward structures and neural representations that are difficult for humans to interpret. Existing approaches often rely on curated demonstrations…

Machine Learning · Computer Science 2026-01-09 Sahar Admoni, Assaf Hallak, Yftah Ziser, Omer Ben-Porat, Ofra Amir

Recent advances in large language models (LLMs) have enabled the automatic generation of executable code for task planning and control in embodied agents such as robots, demonstrating the potential of LLM-based embodied intelligence.…

Artificial Intelligence · Computer Science 2025-10-27 Sanghyun Ahn, Wonje Choi, Junyong Lee, Jinwoo Park, Honguk Woo

Large Language Models (LLMs) have gained popularity in task planning for long-horizon manipulation tasks. To enhance the validity of LLM-generated plans, visual demonstrations and online videos have been widely employed to guide the…

Robotics · Computer Science 2025-03-12 Kejia Chen, Zheng Shen, Yue Zhang, Lingyun Chen, Fan Wu, Zhenshan Bing, Sami Haddadin, Alois Knoll

In recent years, reinforcement learning and imitation learning have shown great potential for controlling humanoid robots' motion. However, these methods typically create simulation environments and rewards for specific tasks, resulting in…

Robotics · Computer Science 2024-08-01 Jingkai Sun, Qiang Zhang, Yiqun Duan, Xiaoyang Jiang, Chong Cheng, Renjing Xu

Large Language Models (LLMs) have emerged as a tool for robots to generate task plans using common sense reasoning. For the LLM to generate actionable plans, scene context must be provided, often through a map. Recent works have shifted from…

Robotics · Computer Science 2024-09-25 Mike Zhang, Kaixian Qu, Vaishakh Patil, Cesar Cadena, Marco Hutter

Accurate prediction of human behavior is crucial for AI systems to effectively support real-world applications, such as autonomous robots anticipating and assisting with human tasks. Real-world scenarios frequently present challenges such…

Human-Computer Interaction · Computer Science 2025-07-21 Kojiro Takeyama, Yimeng Liu, Misha Sra

Embodied agents designed to assist users with tasks must engage in natural language interactions, interpret instructions, execute actions, and communicate effectively to resolve issues. However, collecting large-scale, diverse datasets of…

Computation and Language · Computer Science 2024-11-01 Daniel Philipov, Vardhan Dongre, Gokhan Tur, Dilek Hakkani-Tür

Large language models (LLMs) trained on code completion have been shown to be capable of synthesizing simple Python programs from docstrings [1]. We find that these code-writing LLMs can be re-purposed to write robot policy code, given…

Robotics · Computer Science 2023-05-26 Jacky Liang, Wenlong Huang, Fei Xia, Peng Xu, Karol Hausman, Brian Ichter, Pete Florence, Andy Zeng

The field of learning analytics has made notable strides in automating the detection of complex learning processes in multimodal data. However, most advancements have focused on individualized problem-solving instead of collaborative,…

Large Language Models (LLMs) and strong vision models have enabled rapid research and development in the field of Vision-Language-Action models that enable robotic control. The main objective of these methods is to develop a generalist…

Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks, yet they face significant challenges in embodied task planning scenarios that require continuous environmental understanding and action generation.…

Computation and Language · Computer Science 2025-07-01 Zhaoye Fei, Li Ji, Siyin Wang, Junhao Shi, Jingjing Gong, Xipeng Qiu

Language model (LM) pre-training is useful in many language processing tasks. But can pre-trained LMs be further leveraged for more general machine learning problems? We propose an approach for using LMs to scaffold learning and…

Replicating human-level intelligence in the execution of embodied tasks remains challenging due to the unconstrained nature of real-world environments. Novel use of large language models (LLMs) for task planning seeks to address the…
