Related papers: Layout Generation Agents with Large Language Model…

LLM-mediated Dynamic Plan Generation with a Multi-Agent Approach

Planning methods with high adaptability to dynamic environments are crucial for the development of autonomous and versatile robots. We propose a method for leveraging a large language model (GPT-4o) to automatically generate networks…

Artificial Intelligence · Computer Science 2025-04-03 Reo Abe, Akifumi Ito, Kanata Takayasu, Satoshi Kurihara

LayoutGPT: Compositional Visual Planning and Generation with Large Language Models

Attaining a high degree of user controllability in visual generation often requires intricate, fine-grained inputs like layouts. However, such inputs impose a substantial burden on users when compared to simple text inputs. To address the…

Computer Vision and Pattern Recognition · Computer Science 2023-10-31 Weixi Feng, Wanrong Zhu, Tsu-jui Fu, Varun Jampani, Arjun Akula, Xuehai He, Sugato Basu, Xin Eric Wang, William Yang Wang

3D-GPT: Procedural 3D Modeling with Large Language Models

In the pursuit of efficient automated content creation, procedural generation, leveraging modifiable parameters and rule-based systems, emerges as a promising approach. Nonetheless, it could be a demanding endeavor, given its intricate…

Computer Vision and Pattern Recognition · Computer Science 2024-05-30 Chunyi Sun, Junlin Han, Weijian Deng, Xinlong Wang, Zishan Qin, Stephen Gould

Large Language Model-Enabled Multi-Agent Manufacturing Systems

Traditional manufacturing faces challenges adapting to dynamic environments and quickly responding to manufacturing changes. The use of multi-agent systems has improved adaptability and coordination but requires further advancements in…

Multiagent Systems · Computer Science 2024-06-24 Jonghan Lim, Birgit Vogel-Heuser, Ilya Kovalenko

Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule

Vision-and-language navigation (VLN) is a task in which an agent is embodied in a realistic 3D environment and follows an instruction to reach the goal node. While most of the previous studies have built and investigated a discriminative…

Computation and Language · Computer Science 2020-10-09 Shuhei Kurita, Kyunghyun Cho

Chat2Layout: Interactive 3D Furniture Layout with a Multimodal LLM

Automatic furniture layout is long desired for convenient interior design. Leveraging the remarkable visual reasoning capabilities of multimodal large language models (MLLMs), recent methods address layout generation in a static manner,…

Computer Vision and Pattern Recognition · Computer Science 2024-08-01 Can Wang, Hongliang Zhong, Menglei Chai, Mingming He, Dongdong Chen, Jing Liao

Exploring GPT-4 for Robotic Agent Strategy with Real-Time State Feedback and a Reactive Behaviour Framework

We explore the use of GPT-4 on a humanoid robot in simulation and the real world as proof of concept of a novel large language model (LLM) driven behaviour method. LLMs have shown the ability to perform various tasks, including robotic…

Robotics · Computer Science 2025-04-01 Thomas O'Brien, Ysobel Sims

Exploring Large Language Models to Facilitate Variable Autonomy for Human-Robot Teaming

In a rapidly evolving digital landscape autonomous tools and robots are becoming commonplace. Recognizing the significance of this development, this paper explores the integration of Large Language Models (LLMs) like Generative pre-trained…

Human-Computer Interaction · Computer Science 2024-03-22 Younes Lakhnati, Max Pascher, Jens Gerken

Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage

The advancement of large language models (LLMs) prompts the development of multi-modal agents, which are used as a controller to call external tools, providing a feasible way to solve practical tasks. In this paper, we propose a multi-modal…

Artificial Intelligence · Computer Science 2025-02-04 Zhi Gao, Bofei Zhang, Pengxiang Li, Xiaojian Ma, Tao Yuan, Yue Fan, Yuwei Wu, Yunde Jia, Song-Chun Zhu, Qing Li

"You tell me": A Dataset of GPT-4-Based Behaviour Change Support Conversations

Conversational agents are increasingly used to address emotional needs on top of information needs. One use case of increasing interest is counselling-style mental health and behaviour change interventions, with large language model…

Human-Computer Interaction · Computer Science 2026-04-22 Selina Meyer, David Elsweiler

Towards Language-guided Interactive 3D Generation: LLMs as Layout Interpreter with Generative Feedback

Generating and editing a 3D scene guided by natural language poses a challenge, primarily due to the complexity of specifying the positional relations and volumetric changes within the 3D space. Recent advancements in Large Language Models…

Computer Vision and Pattern Recognition · Computer Science 2023-05-26 Yiqi Lin, Hao Wu, Ruichen Wang, Haonan Lu, Xiaodong Lin, Hui Xiong, Lin Wang

LayoutAgent: A Vision-Language Agent Guided Compositional Diffusion for Spatial Layout Planning

Designing realistic multi-object scenes requires not only generating images, but also planning spatial layouts that respect semantic relations and physical plausibility. On one hand, while recent advances in diffusion models have enabled…

Computer Vision and Pattern Recognition · Computer Science 2025-09-30 Zezhong Fan, Xiaohan Li, Luyi Ma, Kai Zhao, Liang Peng, Topojoy Biswas, Evren Korpeoglu, Kaushiki Nag, Kannan Achan

Automatic Generation of Constrained Furniture Layouts

Efficient authoring of vast virtual environments hinges on algorithms that are able to automatically generate content while also being controllable. We propose a method to automatically generate furniture layouts for indoor environments.…

Computer Vision and Pattern Recognition · Computer Science 2019-01-28 Paul Henderson, Kartic Subr, Vittorio Ferrari

Large Language Models for Virtual Human Gesture Selection

Co-speech gestures convey a wide variety of meanings and play an important role in face-to-face human interactions. These gestures significantly influence the addressee's engagement, recall, comprehension, and attitudes toward the speaker.…

Human-Computer Interaction · Computer Science 2025-03-19 Parisa Ghanad Torshizi, Laura B. Hensel, Ari Shapiro, Stacy C. Marsella

Large Language Models for Robotics: Opportunities, Challenges, and Perspectives

Large language models (LLMs) have undergone significant expansion and have been increasingly integrated across various domains. Notably, in the realm of robot task planning, LLMs harness their advanced reasoning and language comprehension…

Robotics · Computer Science 2024-01-10 Jiaqi Wang, Zihao Wu, Yiwei Li, Hanqi Jiang, Peng Shu, Enze Shi, Huawen Hu, Chong Ma, Yiheng Liu, Xuhui Wang, Yincheng Yao, Xuan Liu, Huaqin Zhao, Zhengliang Liu, Haixing Dai, Lin Zhao, Bao Ge, Xiang Li, Tianming Liu, Shu Zhang

RoboLayout: Differentiable 3D Scene Generation for Embodied Agents

Recent advances in vision language models (VLMs) have shown strong potential for spatial reasoning and 3D scene layout generation from open-ended language instructions. However, generating layouts that are not only semantically coherent but…

Artificial Intelligence · Computer Science 2026-03-10 Ali Shamsaddinlou

Generative AI-Based Virtual Assistant using Retrieval-Augmented Generation: An evaluation study for bachelor projects

Large Language Models have been increasingly employed in the creation of Virtual Assistants due to their ability to generate human-like text and handle complex inquiries. While these models hold great promise, challenges such as…

Computation and Language · Computer Science 2026-04-30 Dumitru Verşebeniuc, Martijn Elands, Sara Falahatkar, Chiara Magrone, Mohammad Falah, Martijn Boussé, Aki Härmä

AutoGenesisAgent: Self-Generating Multi-Agent Systems for Complex Tasks

The proliferation of large language models (LLMs) and their integration into multi-agent systems has paved the way for sophisticated automation in various domains. This paper introduces AutoGenesisAgent, a multi-agent system that…

Multiagent Systems · Computer Science 2024-04-29 Jeremy Harper

A Large Language Model-based multi-agent manufacturing system for intelligent shopfloor

As customer demand for multi-variety and small-batch production increases, dynamic disturbances place greater demands on manufacturing systems. To address such challenges, researchers proposed the multi-agent manufacturing system. However,…

Artificial Intelligence · Computer Science 2025-09-23 Zhen Zhao, Dunbing Tang, Changchun Liu, Liping Wang, Zequn Zhang, Haihua Zhu, Kai Chen, Qingwei Nie, Yuchen Ji

Text2VR: Automated instruction Generation in Virtual Reality using Large language Models for Assembly Task

Virtual Reality (VR) has emerged as a powerful tool for workforce training, offering immersive, interactive, and risk-free environments that enhance skill acquisition, decision-making, and confidence. Despite its advantages, developing VR…

Computer Vision and Pattern Recognition · Computer Science 2025-08-07 Subin Raj Peter