Related papers: Henge: Intent-driven Multi-Tenant Stream Processin…

IntentContinuum: Using LLMs to Support Intent-Based Computing Across the Compute Continuum

The increasing proliferation of IoT devices and AI applications has created a demand for scalable and efficient computing solutions, particularly for applications requiring real-time processing. The compute continuum integrates edge and…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-14 Negin Akbari, John Grundy, Aamir Cheema, Adel N. Toosi

TENT: A Declarative Slice Spraying Engine for Performant and Resilient Data Movement in Disaggregated LLM Serving

Modern GPU clusters are built upon a complex hierarchy of heterogeneous interconnects, ranging from multi-rail RDMA to proprietary fabrics such as Multi-Node NVLink and Ascend UB. Orchestrating these diverse links effectively remains a…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-02 Feng Ren, Ruoyu Qin, Teng Ma, Shangming Cai, Zheng Liu, Chao Lei, Dejiang Zhu, Ke Yang, Zheming Li, Jialei Cui, Weixiao Huang, Yikai Zhao, Yineng Zhang, Hao Wu, Xiang Gao, Yuhao Fu, Jinlei Jiang, Yongwei Wu, Mingxing Zhang

HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location

Large language models (LLMs) have facilitated a wide range of applications with distinct service-level objectives (SLOs), from latency-sensitive online tasks like interactive chatbots to throughput-oriented offline workloads like data…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-31 Ting Sun, Penghan Wang, Fan Lai

Move Fast and Meet Deadlines: Fine-grained Real-time Stream Processing with Cameo

Resource provisioning in multi-tenant stream processing systems faces the dual challenges of keeping resource utilization high (without over-provisioning), and ensuring performance isolation. In our common production use cases, where…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-08 Le Xu, Shivaram Venkataraman, Indranil Gupta, Luo Mai, Rahul Potharaju

Designing Co-operation in Systems of Hierarchical, Multi-objective Schedulers for Stream Processing

Stream processing is a computing paradigm that supports real-time data processing for a wide variety of applications. At Meta, it's used across the company for various tasks such as deriving product insights, providing and improving user…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-12-09 Animesh Dangwal, Yufeng Jiang, Charlie Arnold, Jun Fan, Mohamed Bassem, Aish Rajagopal

A Streaming End-to-End Framework For Spoken Language Understanding

End-to-end spoken language understanding (SLU) has recently attracted increasing interest. Compared to the conventional tandem-based approach that combines speech recognition and language understanding as separate modules, the new approach…

Computation and Language · Computer Science 2021-07-20 Nihal Potdar, Anderson R. Avila, Chao Xing, Dong Wang, Yiran Cao, Xiao Chen

Batch Query Processing and Optimization for Agentic Workflows

Large Language Models (LLMs) in agentic workflows combine multi-step reasoning, heterogeneous tool use, and collaboration across multiple specialized agents. Existing LLM serving engines optimize individual calls in isolation, while…

Databases · Computer Science 2026-01-21 Junyi Shen, Noppanat Wadlom, Yao Lu

Scheduling of Intermittent Query Processing

Stream processing is usually done either on a tuple-by-tuple basis or in micro-batches. There are many applications where tuples over a predefined duration/window must be processed within certain deadlines. Processing such queries using…

Databases · Computer Science 2024-09-23 Saranya Chandrasekaran, S. Sudarshan

HENS unchained: MILP implementation of multi-stage utilities with stream splits, variable temperatures and flow capacities

Heat exchanger network synthesis (HENS) is a well-studied method in research for determining cost-optimal heat exchanger networks. In this paper, we present a modified superstructure formulation to implement streams with variable…

Optimization and Control · Mathematics 2023-06-19 David Huber, Felix Birkelbach, Rene Hofmann

Dynamic Service Scheduling and Resource Management in Energy-Harvesting Multi-access Edge Computing

Multi-access Edge Computing (MEC) delivers low-latency services by hosting applications near end-users. To promote sustainability, these systems are increasingly integrated with renewable Energy Harvesting (EH) technologies, enabling…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-21 Shuyi Chen, Panagiotis Oikonomou, Zhengchang Hua, Nikos Tziritas, Karim Djemame, Nan Zhang, Georgios Theodoropoulos

To Share, or not to Share Online Event Trend Aggregation Over Bursty Event Streams

Complex event processing (CEP) systems continuously evaluate large workloads of pattern queries under tight time constraints. Event trend aggregation queries with Kleene patterns are commonly used to retrieve summarized insights about the…

Databases · Computer Science 2021-03-04 Olga Poppe, Chuan Lei, Lei Ma, Allison Rozet, Elke A. Rundensteiner

AccelGen: Heterogeneous SLO-Guaranteed High-Throughput LLM Inference Serving for Diverse Applications

In this paper, we consider a mixed-prompt scenario for a large language model (LLM) inference serving system that supports diverse applications with both short prompts and long prompts and heterogeneous SLOs for iteration time. To improve…

Computation and Language · Computer Science 2025-03-19 Haiying Shen, Tanmoy Sen

Semantic Caching and Intent-Driven Context Optimization for Multi-Agent Natural Language to Code Systems

We present a production-optimized multi-agent system designed to translate natural language queries into executable Python code for structured data analytics. Unlike systems that rely on expensive frontier models, our approach achieves high…

Software Engineering · Computer Science 2026-01-21 Harmohit Singh

Flow Splitting with Fate Sharing in a Next Generation Transport Services Architecture

The challenges of optimizing end-to-end performance over diverse Internet paths have driven widespread adoption of in-path optimizers, which can destructively interfere with TCP's end-to-end semantics and with each other, and are…

Networking and Internet Architecture · Computer Science 2009-12-07 Janardhan Iyengar, Bryan Ford

Online Learning for Autonomous Management of Intent-based 6G Networks

The growing complexity of networks and the variety of future scenarios with diverse and often stringent performance requirements call for a higher level of automation. Intent-based management emerges as a solution to attain high level of…

Networking and Internet Architecture · Computer Science 2024-07-26 Erciyes Karakaya, Ozgur Ercetin, Huseyin Ozkan, Mehmet Karaca, Elham Dehghan Biyar, Alexandros Palaios

Automating Multi-Tenancy Performance Evaluation on Edge Compute Nodes

Edge Computing emerges as a promising alternative to Cloud Computing, with scalable compute resources and services deployed in the path between IoT devices and Cloud. Since virtualization techniques can be applied on Edge compute nodes,…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-13 Joanna Georgiou, Moysis Symeonides, George Pallis, Marios D. Dikaiakos

On the Semantic Overlap of Operators in Stream Processing Engines

Stream processing is extensively used in the IoT-to-Cloud spectrum to distill information from continuous streams of data. Streaming applications usually run in dedicated Stream Processing Engines (SPEs) that adopt the DataFlow model, which…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-03-03 Vincenzo Gulisano, Alessandro Margara, Marina Papatriantafilou

Enabling SLO-Aware 5G Multi-Access Edge Computing with SMEC

Multi-access edge computing (MEC) promises to enable latency-critical applications by bringing computational power closer to mobile devices, but our measurements on commercial MEC deployments reveal frequent SLO violations due to high tail…

Networking and Internet Architecture · Computer Science 2026-03-24 Xiao Zhang, Daehyeok Kim

RefinementEngine: Automating Intent-to-Device Filtering Policy Deployment under Network Constraints

Translating security intent into deployable network enforcement rules and maintaining their effectiveness despite evolving cyber threats remains a largely manual process in most Security Operations Centers (SOCs). In large and heterogeneous…

Cryptography and Security · Computer Science 2026-04-03 Davide Colaiacomo, Chiara Bonfanti, Cataldo Basile

HEATS: Heterogeneity- and Energy-Aware Task-based Scheduling

Cloud providers usually offer diverse types of hardware for their users. Customers exploit this option to deploy cloud instances featuring GPUs, FPGAs, architectures other than x86 (e.g., ARM, IBM Power8), or featuring certain specific…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-28 Isabelly Rocha, Christian Göttel, Pascal Felber, Marcelo Pasin, Romain Rouvoy, Valerio Schiavoni