Related papers: ReactGenie: A Development Framework for Complex Mu…

MRecGen: Multimodal Appropriate Reaction Generator

Verbal and non-verbal human reaction generation is a challenging task, as different reactions could be appropriate for responding to the same behaviour. This paper proposes the first multiple and multimodal (verbal and nonverbal)…

Computer Vision and Pattern Recognition · Computer Science 2023-07-07 Jiaqi Xu, Cheng Luo, Weicheng Xie, Linlin Shen, Xiaofeng Liu, Lu Liu, Hatice Gunes, Siyang Song

ReMoGen: Real-time Human Interaction-to-Reaction Generation via Modular Learning from Diverse Data

Human behaviors in real-world environments are inherently interactive, with an individual's motion shaped by surrounding agents and the scene. Such capabilities are essential for applications in virtual avatars, interactive animation, and…

Computer Vision and Pattern Recognition · Computer Science 2026-04-02 Yaoqin Ye, Yiteng Xu, Qin Sun, Xinge Zhu, Yujing Sun, Yuexin Ma

AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation

AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks. AutoGen agents are customizable, conversable, and can operate in various modes…

Artificial Intelligence · Computer Science 2023-10-05 Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang

RA-Gen: A Controllable Code Generation Framework Using ReAct for Multi-Agent Task Execution

Code generation models based on large language models (LLMs) have gained wide adoption, but challenges remain in ensuring safety, accuracy, and controllability, especially for complex tasks. Existing methods often lack dynamic integration…

Software Engineering · Computer Science 2025-10-13 Aofan Liu, Haoxuan Li, Bin Wang, Ao Yang, Hui Li

Generative Interfaces for Language Models

Large language models (LLMs) are increasingly seen as assistants, copilots, and consultants, capable of supporting a wide range of tasks through natural conversation. However, most systems remain constrained by a linear request-response…

Computation and Language · Computer Science 2026-05-05 Jiaqi Chen, Yanzhe Zhang, Yutong Zhang, Yijia Shao, Diyi Yang

SGLang: Efficient Execution of Structured Language Model Programs

Large language models (LLMs) are increasingly used for complex tasks that require multiple generation calls, advanced prompting techniques, control flow, and structured inputs/outputs. However, efficient systems are lacking for programming…

Artificial Intelligence · Computer Science 2024-06-07 Lianmin Zheng, Liangsheng Yin, Zhiqiang Xie, Chuyue Sun, Jeff Huang, Cody Hao Yu, Shiyi Cao, Christos Kozyrakis, Ion Stoica, Joseph E. Gonzalez, Clark Barrett, Ying Sheng

A Framework for Creating Natural Language User Interfaces for Action-Based Applications

In this paper we present a framework for creating natural language interfaces to action-based applications. Our framework uses a number of reusable application-independent components, in order to reduce the effort of creating a natural…

Computation and Language · Computer Science 2007-05-23 Stephen Chong, Riccardo Pucella

PreGenie: An Agentic Framework for High-quality Visual Presentation Generation

Visual presentations are vital for effective communication. Early attempts to automate their creation using deep learning often faced issues such as poorly organized layouts, inaccurate text summarization, and a lack of image understanding,…

Machine Learning · Computer Science 2025-09-03 Xiaojie Xu, Xinli Xu, Sirui Chen, Haoyu Chen, Fan Zhang, Ying-Cong Chen

WorkflowGen: an adaptive workflow generation mechanism driven by trajectory experience

Large language model (LLM) agents often suffer from high reasoning overhead, excessive token consumption, unstable execution, and inability to reuse past experiences in complex tasks like business queries, tool use, and workflow…

Machine Learning · Computer Science 2026-04-23 Ruocan Wei, Shufeng Wang, Ziwei Shi

QueryGenie: Making LLM-Based Database Querying Transparent and Controllable

Conversational user interfaces powered by large language models (LLMs) have significantly lowered the technical barriers to database querying. However, existing tools still encounter several challenges, such as misinterpretation of user…

Human-Computer Interaction · Computer Science 2025-08-22 Longfei Chen, Shenghan Gao, Shiwei Wang, Ken Lin, Yun Wang, Quan Li

Learning with Challenges: Adaptive Difficulty-Aware Data Generation for Mobile GUI Agent Training

Large-scale, high-quality interaction trajectories are essential for advancing mobile Graphical User Interface (GUI) agents. While existing methods typically rely on labor-intensive human demonstrations or automated model exploration to…

Artificial Intelligence · Computer Science 2026-02-02 Linjia Kang, Zhimin Wang, Yongkang Zhang, Duo Wu, Jinghe Wang, Ming Ma, Haopeng Yan, Zhi Wang

Geno: A Developer Tool for Authoring Multimodal Interaction on Existing Web Applications

Supporting voice commands in applications presents significant benefits to users. However, adding such support to existing GUI-based web apps is effort-consuming with a high learning barrier, as shown in our formative study, due to the lack…

Human-Computer Interaction · Computer Science 2020-07-21 Ritam Jyoti Sarmah, Yunpeng Ding, Di Wang, Cheuk Yin Phipson Lee, Toby Jia-Jun Li, Xiang 'Anthony' Chen

AppAgent v2: Advanced Agent for Flexible Mobile Interactions

With the advancement of Multimodal Large Language Models (MLLM), LLM-driven visual agents are increasingly impacting software interfaces, particularly those with graphical user interfaces. This work introduces a novel LLM-based multimodal…

Human-Computer Interaction · Computer Science 2025-09-18 Yanda Li, Chi Zhang, Wenjia Jiang, Wanqi Yang, Bin Fu, Pei Cheng, Xin Chen, Ling Chen, Yunchao Wei

WeGen: A Unified Model for Interactive Multimodal Generation as We Chat

Existing multimodal generative models fall short as qualified design copilots, as they often struggle to generate imaginative outputs once instructions are less detailed or lack the ability to maintain consistency with the provided…

Computer Vision and Pattern Recognition · Computer Science 2025-03-11 Zhipeng Huang, Shaobin Zhuang, Canmiao Fu, Binxin Yang, Ying Zhang, Chong Sun, Zhizheng Zhang, Yali Wang, Chen Li, Zheng-Jun Zha

ChatGPT and Other Large Language Models as Evolutionary Engines for Online Interactive Collaborative Game Design

Large language models (LLMs) have taken the scientific world by storm, changing the landscape of natural language processing and human-computer interaction. These powerful tools can answer complex questions and, surprisingly, perform…

Artificial Intelligence · Computer Science 2023-11-14 Pier Luca Lanzi, Daniele Loiacono

Interactive Task Planning with Language Models

An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals and distinct tasks, even during execution. However, most traditional methods require predefined module design, making it hard to…

Robotics · Computer Science 2025-02-11 Boyi Li, Philipp Wu, Pieter Abbeel, Jitendra Malik

Dialogue-based generation of self-driving simulation scenarios using Large Language Models

Simulation is an invaluable tool for developing and evaluating controllers for self-driving cars. Current simulation frameworks are driven by highly-specialist domain specific languages, and so a natural language interface would greatly…

Artificial Intelligence · Computer Science 2023-10-27 Antonio Valerio Miceli-Barone, Alex Lascarides, Craig Innes

RunAgent: Interpreting Natural-Language Plans with Constraint-Guided Execution

Humans solve problems by executing targeted plans, yet large language models (LLMs) remain unreliable for structured workflow execution. We propose RunAgent, a multi-agent plan execution platform that interprets natural-language plans while…

Machine Learning · Computer Science 2026-05-04 Arunabh Srivastava, Mohammad A. Khojastepour, Srimat Chakradhar, Sennur Ulukus

RestGPT: Connecting Large Language Models with Real-World RESTful APIs

Tool-augmented large language models (LLMs) have achieved remarkable progress in tackling a broad range of tasks. However, existing methods are mainly restricted to specifically designed tools and fail to fulfill complex instructions,…

Computation and Language · Computer Science 2023-08-29 Yifan Song, Weimin Xiong, Dawei Zhu, Wenhao Wu, Han Qian, Mingbo Song, Hailiang Huang, Cheng Li, Ke Wang, Rong Yao, Ye Tian, Sujian Li

ReactFace: Online Multiple Appropriate Facial Reaction Generation in Dyadic Interactions

In dyadic interaction, predicting the listener's facial reactions is challenging as different reactions could be appropriate in response to the same speaker's behaviour. Previous approaches predominantly treated this task as an…

Computer Vision and Pattern Recognition · Computer Science 2024-11-05 Cheng Luo, Siyang Song, Weicheng Xie, Micol Spitale, Zongyuan Ge, Linlin Shen, Hatice Gunes