Related papers: Floe: A Continuous Dataflow Framework for Dynamic …
Modern cloud architectures demand self-adaptive capabilities to manage dynamic operational conditions. Yet, existing solutions often impose centralized control models ill-suited to microservices decentralized nature. This paper presents…
Stream workflow application such as online anomaly detection or online traffic monitoring, integrates multiple streaming big data applications into data analysis pipeline. This application can be highly dynamic in nature, where the data…
The Internet of Things (IoT) is offering unprecedented observational data that are used for managing Smart City utilities. Edge and Fog gateway devices are an integral part of IoT deployments to acquire real-time data and enact controls.…
In this paper we introduce vFlow - A framework for rapid designing of batch processing applications for Cloud Computing environment. vFlow batch processing system extracts tasks from the vPlans diagrams, systematically captures the dynamics…
Deploying large language models (LLMs) in real-time systems remains challenging due to their substantial computational demands and privacy concerns. We propose Floe, a hybrid federated learning framework designed for latency-sensitive,…
The increasing complexity of IoT applications and the continuous growth in data generated by connected devices have led to significant challenges in managing resources and meeting performance requirements in computing continuum…
Serverless computing that runs functions with auto-scaling is a popular task execution pattern in the cloud-native era. By connecting serverless functions into workflows, tenants can achieve complex functionality. Prior researches adopt the…
Stream applications are widely deployed on the cloud. While modern distributed streaming systems like Flink and Spark Streaming can schedule and execute them efficiently, streaming dataflows are often dynamically changing, which may cause…
Experimental science is increasingly driven by instruments that produce vast volumes of data and thus a need to manage, compute, describe, and index this data. High performance and distributed computing provide the means of addressing the…
We present DataFlow, a computational framework for building, testing, and deploying high-performance machine learning systems on unbounded time-series data. Traditional data science workflows assume finite datasets and require substantial…
Fog computing extends the cloud computing paradigm by allocating substantial portions of computations and services towards the edge of a network, and is, therefore, particularly suitable for large-scale, geo-distributed, and data-intensive…
Under several emerging application scenarios, such as in smart cities, operational monitoring of large infrastructure, wearable assistance, and Internet of Things, continuous data streams must be processed under very short delays. Several…
In the era of big data, managing dynamic data flows efficiently is crucial as traditional storage models struggle with real-time regulation and risk overflow. This paper introduces Data Dams, a novel framework designed to optimize data…
This paper presents a stream-oriented architecture for structuring cluster applications. Clusters that run applications based on this architecture can scale to tenths of thousands of nodes with significantly less performance loss or…
This paper introduces FlowUnits, a novel programming and deployment model that extends the traditional dataflow paradigm to address the unique challenges of edge-to-cloud computing environments. While conventional dataflow systems offer…
Internet of Things (IoT) is leading to the pervasive availability of streaming data about the physical world, coupled with edge computing infrastructure deployed as part of smart cities and 5G rollout. These constrained, less reliable but…
Edge-cloud collaborative inference is becoming a practical necessity for LLM-powered edge devices: on-device models often cannot afford the required reasoning capability, while cloud-only inference could be prohibitively costly and slow…
Scientific simulation leveraging high-performance computing (HPC) systems is crucial for modeling complex systems and phenomena in fields such as astrophysics, climate science, and fluid dynamics, generating massive datasets that often…
Modern cloud-native systems increasingly rely on multi-cluster deployments to support scalability, resilience, and geographic distribution. However, existing resource management approaches remain largely reactive and cluster-centric,…
In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters. Although each tool claims to provide better programming, data and execution models, for which only…