Related papers: A Policy-Driven Runtime Layer for Agentic LLM Serv…

200 papers

Large language model (LLM) based agentic workflows have become a popular paradigm for coordinating multiple specialized agents to solve complex tasks. To improve serving efficiency, existing LLM systems employ prefix caching to reuse…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-11 Zaifeng Pan, Ajjkumar Patel, Zhengding Hu, Yipeng Shen, Yue Guan, Wan-Lu Li, Lianhui Qin, Yida Wang, Yufei Ding

Agentic workflows are composed of sequences of interdependent Large Language Model (LLM) calls, and they have become a dominant workload in modern AI systems. These workflows exhibit extensive redundancy from overlapping prompts and…

Multiagent Systems · Computer Science 2026-03-18 Noppanat Wadlom, Junyi Shen, Yao Lu

Recent advancements in Large Language Model (LLM) agents have enabled complex multi-turn agentic tasks requiring extensive tool calling, where conversations can span dozens of API calls with increasingly large context windows. However,…

Computation and Language · Computer Science 2026-02-03 Elias Lumer, Faheem Nizar, Akshaya Jangiti, Kevin Frank, Anmol Gulati, Mandar Phadate, Vamse Kumar Subbiah

Large language models (LLMs) have shown promise for automated patching, but their effectiveness depends strongly on how they are integrated into patching systems. While prior work explores prompting strategies and individual agent designs,…

Cryptography and Security · Computer Science 2026-03-03 Qingxiao Xu, Ze Sheng, Zhicheng Chen, Jeff Huang

The rise of multi-agent systems powered by large language models (LLMs) and specialized reasoning agents exposes fundamental limitations in today's data management architectures. Traditional databases and data fabrics were designed for…

Multiagent Systems · Computer Science 2025-12-11 Ioana Giurgiu, Michael E. Nidd

Large Language Models (LLMs) have achieved remarkable success across a wide range of tasks, but serving them efficiently at scale remains a critical challenge due to their substantial computational and latency demands. While most existing…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-04 Yifan Sun, Gholamreza Haffari, Minxian Xu, Rajkumar Buyya, Adel N. Toosi

LLM-based agent applications have shown increasingly remarkable capabilities in complex workflows but incur substantial costs and latency due to extensive planning and reasoning requirements. Existing LLM caching techniques (like context…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-28 Qizheng Zhang, Michael Wornow, Gerry Wan, Kunle Olukotun

Multi-agent LLM systems on edge devices face a memory management problem: device RAM is too small to hold every agent's KV cache simultaneously. On Apple M4 Pro with 10.2 GB of cache budget, only 3 agents fit at 8K context in FP16. A…

Machine Learning · Computer Science 2026-03-06 Yakov Pyotr Shkolnikov

As LLM agents evolve into collaborative multi-agent systems, their memory requirements grow rapidly in complexity. This position paper frames multi-agent memory as a computer architecture problem. We distinguish shared and distributed…

Hardware Architecture · Computer Science 2026-04-01 Zhongming Yu, Naicheng Yu, Hejia Zhang, Wentao Ni, Mingrui Yin, Jiaying Yang, Yujie Zhao, Jishen Zhao

Large language models (LLMs) are increasingly deployed as AI agents that operate in short reasoning-action loops, interleaving model computation with external calls. Unlike traditional chat applications, these agentic workloads require…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-12 Yuning Zhang, Yan Yan, Nan Yang, Dong Yuan

Agentic AI shifts LLM serving from isolated prompt-generation requests to stateful, multi-turn executions that repeatedly invoke the model, call tools, and grow context over time. This paper characterizes ReAct-style agents from both the…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-27 Yichao Yuan, Ankita Nayak, Souvik Kundu, Nishil Talati

Agent orchestration frameworks have proliferated, collectively exceeding 290,000 GitHub stars across LangGraph, CrewAI, Google ADK, OpenAI Agents SDK, Semantic Kernel, Strands, and LlamaIndex. All follow the same pattern: an external…

Artificial Intelligence · Computer Science 2026-05-22 Simon Dennis, Rivaan Patil, Kevin Shabahang, Hao Guo

Multi-agent systems built on large language models (LLMs) require many coordination choices that are difficult to fix a priori: which skill protocol to invoke, which agent role should perform a subtask, which model to bind to each role, how…

Multiagent Systems · Computer Science 2026-05-28 Nicole Koenigstein

Multi-agent large language model (LLM) systems are increasingly adopted for complex language processing tasks that require communication and coordination among agents. However, these systems often suffer substantial overhead from repeated…

Multiagent Systems · Computer Science 2025-11-04 Hancheng Ye, Zhengqi Gao, Mingyuan Ma, Qinsi Wang, Yuzhe Fu, Ming-Yu Chung, Yueqian Lin, Zhijian Liu, Jianyi Zhang, Danyang Zhuo, Yiran Chen

Large Language Models (LLMs), such as OpenAI-o1 and DeepSeek-R1, have demonstrated strong reasoning capabilities. To further enhance LLM capabilities, recent agentic systems, such as Deep Research, incorporate web interactions into LLM…

Artificial Intelligence · Computer Science 2025-10-21 Song Bian, Minghao Yan, Anand Jayarajan, Gennady Pekhimenko, Shivaram Venkataraman

The increasing complexity of AI tasks has shifted the paradigm from monolithic models toward multi-agent large language model (LLM) systems. However, these collaborative architectures introduce a critical bottleneck: redundant prefill…

Machine Learning · Computer Science 2026-03-17 Yingsheng Geng, Yuchong Gao, Weihong Wu, Guyue Liu, Jiang Liu

Recent advances in LLM-based multi-agent systems (MAS) show that workflows composed of multiple LLM agents with distinct roles, tools, and communication patterns can outperform single-LLM baselines on complex tasks. However, most frameworks…

Multiagent Systems · Computer Science 2026-01-21 Jiawei Xu, Arief Koesdwiady, Sisong Bei, Yan Han, Baixiang Huang, Dakuo Wang, Yutong Chen, Zheshen Wang, Peihao Wang, Pan Li, Ying Ding

Large language models are increasingly deployed as complex agentic systems that scale with task complexity. While prior work has extensively explored model- and system-level scaling, algorithm- and task-level scaling remain largely…

Artificial Intelligence · Computer Science 2026-04-21 Zizhang Luo, Yuhao Luo, Youwei Xiao, Yansong Xu, Runlin Guo, Yun Liang

Automated intrusion-style workflows require LLM agents to reason over partial observations, tool outputs, and executable artifacts under bounded budgets. A single LLM instance often compresses evidence extraction, planning, execution, and…

Cryptography and Security · Computer Science 2026-05-12 Minfeng Qi, Tianqing Zhu, Zijie Xu, Congcong Zhu, Qin Wang, Wanlei Zhou

LLM-based agents for industrial asset operations show limited accuracy when reasoning over flat document stores. AssetOpsBench (KDD 2026) establishes that GPT-4 agents achieve 65% on 139 industrial maintenance scenarios backed by CouchDB,…

Databases · Computer Science 2026-05-27 Madhulatha Mandarapu, Sandeep Kunkunuru