Related papers: IntentContinuum: Using LLMs to Support Intent-Base…

Semantic Routing for Enhanced Performance of LLM-Assisted Intent-Based 5G Core Network Management and Orchestration

Large language models (LLMs) are rapidly emerging in Artificial Intelligence (AI) applications, especially in the fields of natural language processing and generative AI. Not limited to text generation applications, these models inherently…

Networking and Internet Architecture · Computer Science 2024-04-25 Dimitrios Michael Manias, Ali Chouman, Abdallah Shami

Intent-Driven Storage Systems: From Low-Level Tuning to High-Level Understanding

Existing storage systems lack visibility into workload intent, limiting their ability to adapt to the semantics of modern, large-scale data-intensive applications. This disconnect leads to brittle heuristics and fragmented, siloed…

Hardware Architecture · Computer Science 2025-10-21 Shai Bergman, Won Wook Song, Lukas Cavigelli, Konstantin Berestizshevsky, Ke Zhou, Ji Zhang

LLM-based policy generation for intent-based management of applications

Automated management requires decomposing high-level user requests, such as intents, to an abstraction that the system can understand and execute. This is challenging because even a simple intent requires performing a number of ordered…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-02-16 Kristina Dzeparoska, Jieyu Lin, Ali Tizghadam, Alberto Leon-Garcia

Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache

Large Language Models (LLMs) demonstrate substantial potential across a diverse array of domains via request serving. However, as trends continue to push for expanding context sizes, the autoregressive nature of LLMs results in highly…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-08 Bin Lin, Chen Zhang, Tao Peng, Hanyu Zhao, Wencong Xiao, Minmin Sun, Anmin Liu, Zhipeng Zhang, Lanbo Li, Xiafei Qiu, Shen Li, Zhigang Ji, Tao Xie, Yong Li, Wei Lin

Comparative Analysis of Large Language Models for the Machine-Assisted Resolution of User Intentions

Large Language Models (LLMs) have emerged as transformative tools for natural language understanding and user intent resolution, enabling tasks such as translation, summarization, and, increasingly, the orchestration of complex workflows.…

Software Engineering · Computer Science 2025-11-12 Justus Flerlage, Alexander Acker, Odej Kao

iServe: An Intent-based Serving System for LLMs

Large Language Models (LLMs) are becoming ubiquitous across industries, where applications demand they fulfill diverse user intents. However, developers currently face the challenge of manually exploring numerous deployment configurations -…

Software Engineering · Computer Science 2025-01-24 Dimitrios Liakopoulos, Tianrui Hu, Prasoon Sinha, Neeraja J. Yadwadkar

Scaling Up Throughput-oriented LLM Inference Applications on Heterogeneous Opportunistic GPU Clusters with Pervasive Context Management

The widespread growth in LLM developments increasingly demands more computational power from clusters than what they can supply. Traditional LLM applications inherently require huge static resource allocations, which force users to either…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-17 Thanh Son Phung, Douglas Thain

Towards Intent-Based Network Management: Large Language Models for Intent Extraction in 5G Core Networks

The integration of Machine Learning and Artificial Intelligence (ML/AI) into fifth-generation (5G) networks has made evident the limitations of network intelligence with ever-increasing, strenuous requirements for current and…

Networking and Internet Architecture · Computer Science 2024-05-24 Dimitrios Michael Manias, Ali Chouman, Abdallah Shami

Cloud-native and Distributed Systems for Efficient and Scalable Large Language Models -- A Research Agenda

The rapid rise of Large Language Models (LLMs) has revolutionized various artificial intelligence (AI) applications, from natural language processing to code generation. However, the computational demands of these models, particularly in…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-21 Minxian Xu, Jingfeng Wu, Shengye Song, Satish Narayana Srirama, Bahman Javad, Rajiv Ranjan, Devki Nandan Jha, Sa Wang, Wenhong Tian, Huanle Xu, Li Li, Zizhao Mo, Shuo Ren, Thomas Kunz, Petar Kochovski, Vlado Stankovski, Kejiang Ye, Chengzhong Xu, Rajkumar Buyya

Dynamic Resource Manager for Automating Deployments in the Computing Continuum

With the growth of real-time applications and IoT devices, computation is moving from cloud-based services to the low latency edge, creating a computing continuum. This continuum includes diverse cloud, edge, and endpoint devices, posing…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-11-27 Zahra Najafabadi Samani, Matthias Gassner, Thomas Fahringer, Juan Aznar Poveda, Stefan Pedratscher

Large Language Models over Networks: Collaborative Intelligence under Resource Constraints

Large language models (LLMs) are transforming society, powering applications from smartphone assistants to autonomous driving. Yet cloud-based LLM services alone cannot serve a growing class of applications, including those operating under…

Signal Processing · Electrical Eng. & Systems 2026-05-12 Liangqi Yuan, Wenzhi Fang, Shiqiang Wang, H. Vincent Poor, Christopher G. Brinton

SplitLLM: Collaborative Inference of LLMs for Model Placement and Throughput Optimization

Large language models (LLMs) have been a disruptive innovation in recent years, and they play a crucial role in our daily lives due to their ability to understand and generate human-like text. Their capabilities include natural language…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-17 Akrit Mudvari, Yuang Jiang, Leandros Tassiulas

Intent Assurance using LLMs guided by Intent Drift

Intent-Based Networking (IBN) presents a paradigm shift for network management, by promising to align intents and business objectives with network operations--in an automated manner. However, its practical realization is challenging: 1)…

Artificial Intelligence · Computer Science 2024-02-06 Kristina Dzeparoska, Ali Tizghadam, Alberto Leon-Garcia

NetIntent: Leveraging Large Language Models for End-to-End Intent-Based SDN Automation

Intent-Based Networking (IBN) often leverages the programmability of Software-Defined Networking (SDN) to simplify network management. However, significant challenges remain in automating the entire pipeline, from user-specified high-level…

Networking and Internet Architecture · Computer Science 2026-02-27 Md. Kamrul Hossain, Walid Aljoby

A Survey on Time-Sensitive Resource Allocation in the Cloud Continuum

Artificial Intelligence (AI) and Internet of Things (IoT) applications are rapidly growing in today's world where they are continuously connected to the internet and process, store and exchange information among the devices and the…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-01 Saravanan Ramanathan, Nitin Shivaraman, Seima Suryasekaran, Arvind Easwaran, Etienne Borde, Sebastian Steinhorst

Workflow-Driven Modeling for the Compute Continuum: An Optimization Approach to Automated System and Workload Scheduling

The convergence of IoT, Edge, Cloud, and HPC technologies creates a compute continuum that merges cloud scalability and flexibility with HPC's computational power and specialized optimizations. However, integrating cloud and HPC resources…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-20 Aasish Kumar Sharma, Christian Boehme, Patrick Gelß, Ramin Yahyapour, Julian Kunkel

LLM-Based Intent Processing and Network Optimization Using Attention-Based Hierarchical Reinforcement Learning

Intent-based network automation is a promising tool to enable easier network management; however, certain challenges need to be effectively addressed. These are: 1) processing intents, i.e., identification of logic and necessary parameters to…

Networking and Internet Architecture · Computer Science 2024-12-24 Md Arafat Habib, Pedro Enrique Iturria Rivera, Yigit Ozcan, Medhat Elsayed, Majid Bavand, Raimundus Gaigalas, Melike Erol-Kantarci

AcceLLM: Accelerating LLM Inference using Redundancy for Load Balancing and Data Locality

Large Language Model (LLM) inference on large-scale systems is expected to dominate future cloud infrastructures. Efficient LLM inference in cloud environments with numerous AI accelerators is challenging, necessitating extensive…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-11-11 Ilias Bournias, Lukas Cavigelli, Georgios Zacharopoulos

Intention and Context Elicitation with Large Language Models in the Legal Aid Intake Process

Large Language Models (LLMs) and chatbots show significant promise in streamlining the legal intake process. This advancement can greatly reduce the workload and costs for legal aid organizations, improving availability while making legal…

Computers and Society · Computer Science 2023-11-23 Nick Goodson, Rongfei Lu

Continuous Reasoning for Managing Next-Gen Distributed Applications

Continuous reasoning has proven effective in incrementally analysing changes in application codebases within Continuous Integration/Continuous Deployment (CI/CD) software release pipelines. In this article, we present a novel declarative…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-23 Stefano Forti, Antonio Brogi