Related papers: The Renoir Dataflow Platform: Efficient Data Proce…
We present Rhino, a system for accelerating tensor programs with automatic parallelization on AI platform for real production environment. It transforms a tensor program written for a single device into an equivalent distributed program…
Today, organizations typically perform tedious and costly tasks to juggle their code and data across different data processing platforms. Addressing this pain and achieving automatic cross-platform data processing is quite challenging…
We present Pathway, a new unified data processing framework that can run workloads on both bounded and unbounded data streams. The framework was created with the original motivation of resolving challenges faced when analyzing and…
The amazing advances being made in the fields of machine and deep learning are a highlight of the Big Data era for both enterprise and research communities. Modern applications require resources beyond a single node's ability to provide.…
The exponential growth in smart sensors and rapid progress in 5G networks is creating a world awash with data streams. However, a key barrier to building performant multi-sensor, distributed stream processing applications is high…
In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters. Although each tool claims to provide better programming, data and execution models, for which only…
Stream reasoning systems are designed for complex decision-making from possibly infinite, dynamic streams of data. Modern approaches to stream reasoning are usually performing their computations using stand-alone solvers, which…
This paper introduces FlowUnits, a novel programming and deployment model that extends the traditional dataflow paradigm to address the unique challenges of edge-to-cloud computing environments. While conventional dataflow systems offer…
Reservoir computing is a recently introduced, highly efficient bio-inspired approach for processing time dependent data. The basic scheme of reservoir computing consists of a non linear recurrent dynamical system coupled to a single input…
From logical reasoning to mental simulation, biological and artificial neural systems possess an incredible capacity for computation. Such neural computers offer a fundamentally novel computing paradigm by representing data continuously and…
R is a numerical computing environment that is widely popular for statistical data analysis. Like many such environments, R performs poorly for large datasets whose sizes exceed that of physical memory. We present our vision of RIOT (R with…
The rapidly growing demand for high-quality data in Large Language Models (LLMs) has intensified the need for scalable, reliable, and semantically rich data preparation pipelines. However, current practices remain dominated by ad-hoc…
Robotic middleware serves as the foundational infrastructure, enabling complex robotic systems to operate in a coordinated and modular manner. In data-intensive robotic applications, especially in industrial scenarios, communication…
Analyzing big data in a highly dynamic environment becomes more and more critical because of the increasingly need for end-to-end processing of this data. Modern data flows are quite complex and there are not efficient, cost-based,…
This article presents an innovative open-source software named ModelFLOWs-app, written in Python, which has been created and tested to generate precise and robust hybrid reduced order models (ROMs) fully data-driven. By integrating modal…
In edge computing use cases (e.g., smart cities), where several users and devices may be in close proximity to each other, computational tasks with similar input data for the same services (e.g., image or video annotation) may be offloaded…
Reservoir Computing (RC) is a powerful computational paradigm that allows high versatility with cheap learning. While other artificial intelligence approaches need exhaustive resources to specify their inner workings, RC is based on a…
The ability to express a program as a hierarchical composition of parts is an essential tool in managing the complexity of software and a key abstraction this provides is to separate the representation of data from the computation. Many…
In many application domains, the proliferation of sensors and devices is generating vast volumes of data, imposing significant pressure on existing data analysis and data mining techniques. Nevertheless, an increase in data volume does not…
We consider two classes of stream-based computations which admit taking linear combinations of execution runs: probabilistic sampling and generalized animation. The dataflow architecture is a natural platform for programming with streams.…