Related papers: A Synopses Data Engine for Interactive Extreme-Sca…

Splash: User-friendly Programming Interface for Parallelizing Stochastic Algorithms

Stochastic algorithms are efficient approaches to solving machine learning and optimization problems. In this paper, we propose a general framework called Splash for parallelizing stochastic algorithms on multi-node distributed systems.…

Machine Learning · Computer Science · 2015-09-24 · Yuchen Zhang, Michael I. Jordan

S+t-SNE -- Bringing Dimensionality Reduction to Data Streams

We present S+t-SNE, an adaptation of the t-SNE algorithm designed to handle infinite data streams. The core idea behind S+t-SNE is to update the t-SNE embedding incrementally as new data arrives, ensuring scalability and adaptability to…

Artificial Intelligence · Computer Science · 2025-01-22 · Pedro C. Vieira, João P. Montrezol, João T. Vieira, João Gama

Sharing of Semantically Enhanced Information for the Adaptive Execution of Business Processes

Motivated by Context-Aware Computing, and more particularly by the Data-Driven Process Adaptation approach, we propose the Semantic Context Space (SCS) Engine, which aims to facilitate the provision of adaptable business processes.…

Software Engineering · Computer Science · 2013-04-09 · Pigi Kouki

On SDN-Enabled Online and Dynamic Bandwidth Allocation for Stream Analytics

Data communication in cloud-based distributed stream data analytics often involves a collection of parallel and pipelined TCP flows. As the standard TCP congestion control mechanism is designed for achieving "fairness" among competing flows…

Networking and Internet Architecture · Computer Science · 2019-08-08 · Walid Aljoby, Xin Wang, Tom Z. J. Fu, Richard T. B. Ma

A Proactive Management Scheme for Data Synopses at the Edge

The combination of the infrastructure provided by the Internet of Things (IoT) with numerous processing nodes present at the Edge Computing (EC) ecosystem opens up new pathways to support intelligent applications. Such applications can be…

Machine Learning · Computer Science · 2021-07-23 · Kostas Kolomvatsos, Christos Anagnostopoulos

Data Synopses Management based on a Deep Learning Model

Pervasive computing involves the placement of processing services close to end users to support intelligent applications. With the advent of the Internet of Things (IoT) and the Edge Computing (EC), one can find room for placing services at…

Distributed, Parallel, and Cluster Computing · Computer Science · 2020-08-05 · Panagiotis Fountas, Kostas Kolomvatsos, Christos Anagnostopoulos

On the Semantic Overlap of Operators in Stream Processing Engines

Stream processing is extensively used in the IoT-to-Cloud spectrum to distill information from continuous streams of data. Streaming applications usually run in dedicated Stream Processing Engines (SPEs) that adopt the DataFlow model, which…

Distributed, Parallel, and Cluster Computing · Computer Science · 2023-03-03 · Vincenzo Gulisano, Alessandro Margara, Marina Papatriantafilou

An Ensemble Scheme for Proactive Data Allocation in Distributed Datasets

The advent of the Internet of Things (IoT) gives the opportunity to numerous devices to interact with their environment, collect and process data. Data are transferred, in an upwards mode, to the Cloud through the Edge Computing (EC)…

Distributed, Parallel, and Cluster Computing · Computer Science · 2020-07-29 · T. Koukaras, K. Kolomvatsos

SYNAPSE: Empowering LLM Agents with Episodic-Semantic Memory via Spreading Activation

While Large Language Models (LLMs) excel at generalized reasoning, standard retrieval-augmented approaches fail to address the disconnected nature of long-term agentic memory. To bridge this gap, we introduce Synapse (Synergistic…

Computation and Language · Computer Science · 2026-02-17 · Hanqi Jiang, Junhao Chen, Yi Pan, Ling Chen, Weihang You, Yifan Zhou, Ruidong Zhang, Andrea Sikora, Lin Zhao, Yohannes Abate, Tianming Liu

Streaming Temporal Graphs: Subgraph Matching

We investigate solutions to subgraph matching within a temporal stream of data. We present a high-level language for describing temporal subgraphs of interest, the Streaming Analytics Language (SAL). SAL programs are translated into C++…

Programming Languages · Computer Science · 2020-04-02 · Eric L. Goodman, Dirk Grunwald

Towards Interactive, Adaptive and Result-aware Big Data Analytics

As data volumes grow across applications, analytics of large amounts of data is becoming increasingly important. Big data processing frameworks such as Apache Hadoop, Apache AsterixDB, and Apache Spark have been built to meet this demand. A…

Distributed, Parallel, and Cluster Computing · Computer Science · 2022-12-15 · Avinash Kumar

Approximate Stream Analytics in Apache Flink and Apache Spark Streaming

Approximate computing aims for efficient execution of workflows where an approximate output is sufficient instead of the exact output. The idea behind approximate computing is to compute over a representative sample instead of the entire…

Distributed, Parallel, and Cluster Computing · Computer Science · 2017-09-12 · Do Le Quoc, Ruichuan Chen, Pramod Bhatotia, Christof Fetzer, Volker Hilt, Thorsten Strufe

Benchmarking Distributed Stream Data Processing Systems

The need for scalable and efficient stream analysis has led to the development of many open-source streaming data processing systems (SDPSs) with highly diverging capabilities and performance characteristics. While first initiatives try to…

Databases · Computer Science · 2019-06-27 · Jeyhun Karimov, Tilmann Rabl, Asterios Katsifodimos, Roman Samarev, Henri Heiskanen, Volker Markl

S2CE: A Hybrid Cloud and Edge Orchestrator for Mining Exascale Distributed Streams

The explosive increase in volume, velocity, variety, and veracity of data generated by distributed and heterogeneous nodes such as IoT and other devices continuously challenges the state of the art in big data processing platforms and mining…

Distributed, Parallel, and Cluster Computing · Computer Science · 2020-07-03 · Nicolas Kourtellis, Herodotos Herodotou, Maciej Grzenda, Piotr Wawrzyniak, Albert Bifet

Distributed Streaming Analytics on Large-scale Oceanographic Data using Apache Spark

Real-world data from diverse domains require real-time scalable analysis. Large-scale data processing frameworks or engines such as Hadoop fall short when results are needed on-the-fly. Apache Spark's streaming library is increasingly…

Distributed, Parallel, and Cluster Computing · Computer Science · 2019-08-02 · Janak Dahal, Elias Ioup, Shaikh Arifuzzaman, Mahdi Abdelguerfi

Incremental Query Processing on Big Data Streams

This paper addresses online query processing for large-scale, incremental data analysis on a distributed stream processing engine (DSPE). Our goal is to convert any SQL-like query to an incremental DSPE program automatically. In contrast to…

Databases · Computer Science · 2016-08-23 · Leonidas Fegaras

Expert Streaming: Accelerating Low-Batch MoE Inference via Multi-chiplet Architecture and Dynamic Expert Trajectory Scheduling

Mixture-of-Experts is a promising approach for edge AI with low-batch inference. Yet, on-device deployments often face limited on-chip memory and severe workload imbalance; the prevalent use of offloading further incurs off-chip memory…

Hardware Architecture · Computer Science · 2026-03-31 · Songchen Ma, Hongyi Li, Weihao Zhang, Yonghao Tan, Pingcheng Dong, Yu Liu, Lan Liu, Yuzhong Jiao, Xuejiao Liu, Luhong Liang, Kwang-Ting Cheng

A Switched Dynamical System Framework for Analysis of Massively Parallel Asynchronous Numerical Algorithms

In the near future, massively parallel computing systems will be necessary to solve computation intensive applications. The key bottleneck in massively parallel implementation of numerical algorithms is the synchronization of data across…

Systems and Control · Computer Science · 2015-03-16 · Kooktae Lee, Raktim Bhattacharya, Vijay Gupta

Asynchronous Complex Analytics in a Distributed Dataflow Architecture

Scalable distributed dataflow systems have recently experienced widespread adoption, with commodity dataflow engines such as Hadoop and Spark, and even commodity SQL engines routinely supporting increasingly sophisticated analytics tasks…

Databases · Computer Science · 2015-10-27 · Joseph E. Gonzalez, Peter Bailis, Michael I. Jordan, Michael J. Franklin, Joseph M. Hellerstein, Ali Ghodsi, Ion Stoica

OmniSketch: Efficient Multi-Dimensional High-Velocity Stream Analytics with Arbitrary Predicates

A key need in different disciplines is to perform analytics over fast-paced data streams, similar in nature to the traditional OLAP analytics in relational databases i.e., with filters and aggregates. Storing unbounded streams, however, is…

Databases · Computer Science · 2023-09-13 · Wieger R. Punter, Odysseas Papapetrou, Minos Garofalakis