Related papers: Efficient LLM Serving for Agentic Workflows: A Dat…

Batch Query Processing and Optimization for Agentic Workflows

Large Language Models (LLMs) in agentic workflows combine multi-step reasoning, heterogeneous tool use, and collaboration across multiple specialized agents. Existing LLM serving engines optimize individual calls in isolation, while…

Databases · Computer Science 2026-01-21 Junyi Shen, Noppanat Wadlom, Yao Lu

Query Optimization Beyond Data Systems: The Case for Multi-Agent Systems

The proliferation of large language models (LLMs) has accelerated the adoption of agent-based workflows, where multiple autonomous agents reason, invoke functions, and collaborate to compose complex data pipelines. However, current…

Databases · Computer Science 2025-12-15 Zoi Kaoudi, Ioana Giurgiu

A Survey on Agent Workflow -- Status and Future

In the age of large language models (LLMs), autonomous agents have emerged as a powerful paradigm for achieving general intelligence. These agents dynamically leverage tools, memory, and reasoning capabilities to accomplish user-defined…

Artificial Intelligence · Computer Science 2025-08-05 Chaojia Yu, Zihan Cheng, Hanwen Cui, Yishuo Gao, Zexu Luo, Yijin Wang, Hangbin Zheng, Yong Zhao

Agentic AI Workload Characteristics

Agentic AI shifts LLM serving from isolated prompt-generation requests to stateful, multi-turn executions that repeatedly invoke the model, call tools, and grow context over time. This paper characterizes ReAct-style agents from both the…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-27 Yichao Yuan, Ankita Nayak, Souvik Kundu, Nishil Talati

Supporting Our AI Overlords: Redesigning Data Systems to be Agent-First

Large Language Model (LLM) agents, acting on their users' behalf to manipulate and analyze data, are likely to become the dominant workload for data systems in the future. When working with data, agents employ a high-throughput process of…

Artificial Intelligence · Computer Science 2025-12-09 Shu Liu, Soujanya Ponnapalli, Shreya Shankar, Sepanta Zeighami, Alan Zhu, Shubham Agarwal, Ruiqi Chen, Samion Suwito, Shuo Yuan, Ion Stoica, Matei Zaharia, Alvin Cheung, Natacha Crooks, Joseph E. Gonzalez, Aditya G. Parameswaran

Autellix: An Efficient Serving Engine for LLM Agents as General Programs

Large language model (LLM) applications are evolving beyond simple chatbots into dynamic, general-purpose agentic programs, which scale LLM calls and output tokens to help AI agents reason, explore, and solve complex tasks. However,…

Machine Learning · Computer Science 2025-02-20 Michael Luo, Xiaoxiang Shi, Colin Cai, Tianjun Zhang, Justin Wong, Yichuan Wang, Chi Wang, Yanping Huang, Zhifeng Chen, Joseph E. Gonzalez, Ion Stoica

Toward Reliable Design of LLM-Enabled Agentic Workflows: Optimizing Latency-Reliability-Cost Tradeoffs

Modern AI systems increasingly rely on workflows composed of multiple interacting agents, some powered by large language models (LLMs) and others by conventional computational modules. This paper analyzes the fundamental tradeoffs between…

Artificial Intelligence · Computer Science 2026-05-26 Ya-Ting Yang, Quanyan Zhu

AXIS: Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents

Multimodal large language models (MLLMs) have enabled LLM-based agents to directly interact with application user interfaces (UIs), enhancing agents' performance in complex tasks. However, these agents often suffer from high latency and low…

Artificial Intelligence · Computer Science 2025-05-20 Junting Lu, Zhiyang Zhang, Fangkai Yang, Jue Zhang, Lu Wang, Chao Du, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

What Limits Agentic Systems Efficiency?

Large Language Models (LLMs), such as OpenAI-o1 and DeepSeek-R1, have demonstrated strong reasoning capabilities. To further enhance LLM capabilities, recent agentic systems, such as Deep Research, incorporate web interactions into LLM…

Artificial Intelligence · Computer Science 2025-10-21 Song Bian, Minghao Yan, Anand Jayarajan, Gennady Pekhimenko, Shivaram Venkataraman

KVFlow: Efficient Prefix Caching for Accelerating LLM-Based Multi-Agent Workflows

Large language model (LLM) based agentic workflows have become a popular paradigm for coordinating multiple specialized agents to solve complex tasks. To improve serving efficiency, existing LLM systems employ prefix caching to reuse…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-11 Zaifeng Pan, Ajjkumar Patel, Zhengding Hu, Yipeng Shen, Yue Guan, Wan-Lu Li, Lianhui Qin, Yida Wang, Yufei Ding

Agentic Workflows for Economic Research: Design and Implementation

This paper introduces a methodology based on agentic workflows for economic research that leverages Large Language Models (LLMs) and multimodal AI to enhance research efficiency and reproducibility. Our approach features autonomous and…

General Economics · Economics 2025-04-15 Herbert Dawid, Philipp Harting, Hankui Wang, Zhongli Wang, Jiachen Yi

ScaleLLM: A Resource-Frugal LLM Serving Framework by Optimizing End-to-End Efficiency

Large language models (LLMs) have surged in popularity and are extensively used in commercial applications, where the efficiency of model serving is crucial for the user experience. Most current research focuses on optimizing individual…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-12 Yuhang Yao, Han Jin, Alay Dilipbhai Shah, Shanshan Han, Zijian Hu, Yide Ran, Dimitris Stripelis, Zhaozhuo Xu, Salman Avestimehr, Chaoyang He

Towards Efficient Agents: A Co-Design of Inference Architecture and System

The rapid development of large language model (LLM)-based agents has unlocked new possibilities for autonomous multi-turn reasoning and tool-augmented decision-making. However, their real-world deployment is hindered by severe…

Computation and Language · Computer Science 2026-02-25 Weizhe Lin, Hui-Ling Zhen, Shuai Yang, Xian Wang, Renxi Liu, Hanting Chen, Wangze Zhang, Chuansai Zhou, Yiming Li, Chen Chen, Xing Li, Zhiyuan Yang, Xiaosong Li, Xianzhi Yu, Zhenhua Dong, Mingxuan Yuan, Yunhe Wang

Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs

Multi-agent applications often execute complex tasks as multi-stage workflows, where each stage is an LLM call whose output becomes part of context for subsequent steps. Existing LLM serving systems largely assume homogeneous clusters with…

Machine Learning · Computer Science 2026-03-24 Kangqi Ni, Wenyue Hua, Xiaoxiang Shi, Jiang Guo, Shiyu Chang, Tianlong Chen

Optimizing Agentic Workflows using Meta-tools

Agentic AI enables LLM to dynamically reason, plan, and interact with tools to solve complex tasks. However, agentic workflows often require many iterative reasoning steps and tool invocations, leading to significant operational expense,…

Artificial Intelligence · Computer Science 2026-02-03 Sami Abuzakuk, Anne-Marie Kermarrec, Rishi Sharma, Rasmus Moorits Veski, Martijn de Vos

HERA: Hybrid Edge-cloud Resource Allocation for Cost-Efficient AI Agents

In the realm of AI, large language models (LLMs) like GPT-4, central to the operation of AI agents, predominantly operate in the cloud, incurring high operational costs. With local-based small language models (SLMs) becoming more accurate,…

Machine Learning · Computer Science 2025-04-02 Shiyi Liu, Haiying Shen, Shuai Che, Mahdi Ghandi, Mingqin Li

A Cloud-based Multi-Agentic Workflow for Science

As Large Language Models (LLMs) become ubiquitous across various scientific domains, their lack of ability to perform complex tasks like running simulations or to make complex decisions limits their utility. LLM-based agents bridge this gap…

Computation and Language · Computer Science 2026-01-21 Anurag Acharya, Timothy Vega, Rizwan A. Ashraf, Anshu Sharma, Derek Parker, Robert Rallo

A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications

Large language model (LLM)-based agents that reason, plan, and act through tools, memory, and structured interaction are emerging as a promising paradigm for automating complex workflows. Recent systems such as OpenClaw and Claude Code…

Information Retrieval · Computer Science 2026-05-27 Yingli Zhou, Wang Shu, Yaodong Su, Wenchuan Du, Yixiang Fang, Xuemin Lin

AgentServe: Algorithm-System Co-Design for Efficient Agentic AI Serving on a Consumer-Grade GPU

Large language models (LLMs) are increasingly deployed as AI agents that operate in short reasoning-action loops, interleaving model computation with external calls. Unlike traditional chat applications, these agentic workloads require…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-12 Yuning Zhang, Yan Yan, Nan Yang, Dong Yuan

Episodic Memory in Agentic Frameworks: Suggesting Next Tasks

Agentic frameworks powered by Large Language Models (LLMs) can be useful tools in scientific workflows by enabling human-AI co-creation. A key challenge is recommending the next steps during workflow creation without relying solely on LLMs,…

Multiagent Systems · Computer Science 2025-11-25 Sandro Rama Fiorini, Leonardo G. Azevedo, Raphael M. Thiago, Valesca M. de Sousa, Anton B. Labate, Viviane Torres da Silva