Related papers: Streaming Temporal Graphs: Subgraph Matching
We present a domain-specific language called SAL(the Streaming Analytics Language) for processing data in a semi-streaming model. In particular we examine the use case of processing netflow data in order to identify malicious actors within…
Streaming neural network models for fast frame-wise responses to various speech and sensory signals are widely adopted on resource-constrained platforms. Hence, increasing the learning capacity of such streaming models (i.e., by adding more…
Streaming analytics are essential in a large range of applications, including databases, networking, and machine learning. To optimize performance, practitioners are increasingly offloading such analytics to network nodes such as switches.…
Context: The combination of distributed stream processing with microservice architectures is an emerging pattern for building data-intensive software systems. In such systems, stream processing frameworks such as Apache Flink, Apache Kafka…
Approximate computing aims for efficient execution of workflows where an approximate output is sufficient instead of the exact output. The idea behind approximate computing is to compute over a representative sample instead of the entire…
To conduct real-time analytics computations, big data stream processing engines are required to process unbounded data streams at millions of events per second. However, current streaming engines exhibit low throughput and high tuple…
Triangle counting is a fundamental and widely studied problem on static graphs, and recently on temporal graphs, where edges carry information on the timings of the associated events. Streaming processing and resource efficiency are crucial…
Motivated by the trend to outsource work to commercial cloud computing services, we consider a variation of the streaming paradigm where a streaming algorithm can be assisted by a powerful helper that can provide annotations to the data…
Stream processing is extensively used in the IoT-to-Cloud spectrum to distill information from continuous streams of data. Streaming applications usually run in dedicated Stream Processing Engines (SPEs) that adopt the DataFlow model, which…
Real-time data analysis and management are increasingly critical for today`s businesses. SQL is the de facto lingua franca for these endeavors, yet support for robust streaming analysis and management with SQL remains limited. Many…
Graphs are found in a plethora of domains, including online social networks, the World Wide Web and the study of epidemics, to name a few. With the advent of greater volumes of information and the need for continuously updated results under…
Time-evolving stream datasets exist ubiquitously in many real-world applications where their inherent hot keys often evolve over times. Nevertheless, few existing solutions can provide efficient load balance on these time-evolving datasets…
Data-flow is a natural approach to parallelism. However, describing dependencies and control between fine-grained data-flow tasks can be complex and present unwanted overheads. TALM (TALM is an Architecture and Language for Multi-threading)…
Stream processing engines (SPEs) are widely used for large scale streaming analytics over unbounded time-ordered data streams. Modern day streaming analytics applications exhibit diverse compute characteristics and demand strict latency and…
Many well-known, real-world problems involve dynamic data which describe the relationship among the entities. Hypergraphs are powerful combinatorial structures that are frequently used to model such data. For many of today's data-centric…
Ubiquitous sensors today emit high frequency streams of numerical measurements that reflect properties of human, animal, industrial, commercial, and natural processes. Shifts in such processes, e.g. caused by external events or internal…
The number of triangles in a graph is a fundamental metric, used in social network analysis, link classification and recommendation, and more. Driven by these applications and the trend that modern graph datasets are both large and dynamic,…
Stream processing applications extract value from raw data through Directed Acyclic Graphs of data analysis tasks. Shared-nothing (SN) parallelism is the de-facto standard to scale stream processing applications. Given an application, SN…
Time series visualization of streaming telemetry (i.e., charting of key metrics such as server load over time) is increasingly prevalent in modern data platforms and applications. However, many existing systems simply plot the raw data…
In this work, we detail the design and structure of a Synopses Data Engine (SDE) which combines the virtues of parallel processing and stream summarization towards delivering interactive analytics at extreme scale. Our SDE is built on top…