Related papers: In-Browser Split-Execution Support for Interactive…

Afterburner: The Case for In-Browser Analytics

This paper explores the novel and unconventional idea of implementing an analytical RDBMS in pure JavaScript so that it runs completely inside a browser with no external dependencies. Our prototype, called Afterburner, generates compiled…

Databases · Computer Science 2016-05-16 Kareem El Gebaly, Jimmy Lin

Accelerating R-based Analytics on the Cloud

This paper addresses how the benefits of cloud-based infrastructure can be harnessed for analytical workloads. Often the software handling analytical workloads is not developed by a professional programmer, but on an ad hoc basis by…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-08-14 Ishan Patel, Andrew Rau-Chaplin, Blesson Varghese

Experimental Performance Evaluation of Cloud-Based Analytics-as-a-Service

An increasing number of Analytics-as-a-Service solutions has recently seen the light in the landscape of cloud-based services. These services allow flexible composition of compute and storage components that create powerful data ingestion…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-03-16 Francesco Pace, Marco Milanesio, Daniele Venzano, Damiano Carra, Pietro Michiardi

Towards a Unified Architecture for in-RDBMS Analytics

The increasing use of statistical data analysis in enterprise applications has created an arms race among database vendors to offer ever more sophisticated in-database analytics. One challenge in this race is that each new statistical…

Databases · Computer Science 2015-03-20 Xixuan Feng, Arun Kumar, Ben Recht, Christopher Ré

Reproducible and Portable Big Data Analytics in the Cloud

Cloud computing has become a major approach to help reproduce computational experiments. Yet there are still two main difficulties in reproducing batch based big data analytics (including descriptive and predictive analytics) in the cloud.…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-03-13 Xin Wang, Pei Guo, Xingyan Li, Aryya Gangopadhyay, Carl E. Busart, Jade Freeman, Jianwu Wang

A Visual Analytics Framework for Distributed Data Analysis Systems

This paper proposes a visual analytics framework that addresses the complex user interactions required through a command-line interface to run analyses in distributed data analysis systems. The visual analytics framework facilitates the…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-12 Abdullah-Al-Raihan Nayeem, Mohammed Elshambakey, Todd Dobbs, Huikyo Lee, Daniel Crichton, Yimin Zhu, Chanachok Chokwitthaya, William J. Tolone, Isaac Cho

Architectural Impact on Performance of In-memory Data Analytics: Apache Spark Case Study

While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-04-29 Ahsan Javed Awan, Mats Brorsson, Vladimir Vlassov, Eduard Ayguade

Architecture for Analysis of Streaming Data

While several attempts have been made to construct a scalable and flexible architecture for analysis of streaming data, no general model to tackle this task exists. Thus, our goal is to build a scalable and maintainable architecture for…

Software Engineering · Computer Science 2019-07-22 Sheik Hoque, Andriy Miranskyy

Online and Offline Analysis of Streaming Data

Online and offline analytics have been traditionally treated separately in software architecture design, and there is no existing general architecture that can support both. Our objective is to go beyond and introduce a scalable and…

Software Engineering · Computer Science 2019-07-22 Sheik Hoque, Andriy Miranskyy

Analytical Engines With Context-Rich Processing: Towards Efficient Next-Generation Analytics

As modern data pipelines continue to collect, produce, and store a variety of data formats, extracting and combining value from traditional and context-rich sources such as strings, text, video, audio, and logs becomes a manual process…

Databases · Computer Science 2023-12-05 Viktor Sanca, Anastasia Ailamaki

Analytics-as-a-Service in a Multi-Cloud Environment through Semantically enabled Hierarchical Data Processing

A large number of cloud middleware platforms and tools are deployed to support a variety of Internet of Things (IoT) data analytics tasks. It is a common practice that such cloud platforms are only used by their owners to achieve their…

Networking and Internet Architecture · Computer Science 2016-06-28 Prem Prakash Jayaraman, Charith Perera, Dimitrios Georgakopoulos, Schahram Dustdar, Dhavalkumar Thakker, Rajiv Ranjan

On the Feasibility and Implications of Self-Contained Search Engines in the Browser

JavaScript engines inside modern browsers are capable of running sophisticated multi-player games, rendering impressive 3D scenes, and supporting complex, interactive visualizations. Can this processing power be harnessed for information…

Information Retrieval · Computer Science 2014-10-17 Jimmy Lin

StreamingHub: Interactive Stream Analysis Workflows

Reusable data/code and reproducible analyses are foundational to quality research. This aspect, however, is often overlooked when designing interactive stream analysis workflows for time-series data (e.g., eye-tracking data). A mechanism to…

Databases · Computer Science 2022-06-20 Yasith Jayawardana, Vikas G. Ashok, Sampath Jayarathna

An Automated Implementation of Hybrid Cloud for Performance Evaluation of Distributed Databases

A Hybrid cloud is an integration of resources between private and public clouds. It enables users to horizontally scale their on-premises infrastructure up to public clouds in order to improve performance and cut up-front investment cost.…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-05 Yaser Mansouri, Victor Prokhorenko, M. Ali Babar

Continuous evaluation of the performance of cloud infrastructure for scientific applications

Cloud computing recently developed into a viable alternative to on-premises systems for executing high-performance computing (HPC) applications. With the emergence of new vendors and hardware options, there is now a growing need to…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-12-14 Mohammad Mohammadi, Timur Bazhirov

ArcaDB: A Container-based Disaggregated Query Engine for Heterogenous Computational Environments

Modern enterprises rely on data management systems to collect, store, and analyze vast amounts of data related to their operations. Nowadays, clusters and hardware accelerators (e.g., GPUs, TPUs) have become a necessity to scale with the…

Databases · Computer Science 2023-11-28 Kristalys Ruiz-Rohena, Manuel Rodriguez-Martinez

Cloud Process Execution Engine: Architecture and Interfaces

Process Execution Engines are a vital part of Business Process Management (BPM) and Manufacturing Orchestration Management (MOM), as they allow the business or manufacturing logic (expressed in a graphical notation such as BPMN) to be…

Other Computer Science · Computer Science 2022-09-20 Juergen Mangler, Stefanie Rinderle-Ma

A Visual Analytics Framework for Reviewing Streaming Performance Data

Understanding and tuning the performance of extreme-scale parallel computing systems demands a streaming approach due to the computational cost of applying offline algorithms to vast amounts of performance log data. Analyzing large…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-28 Suraj P. Kesavan, Takanori Fujiwara, Jianping Kelvin Li, Caitlin Ross, Misbah Mubarak, Christopher D. Carothers, Robert B. Ross, Kwan-Liu Ma

Modularis: Modular Relational Analytics over Heterogeneous Distributed Platforms

The enormous quantity of data produced every day together with advances in data analytics has led to a proliferation of data management and analysis systems. Typically, these systems are built around highly specialized monolithic operators…

Databases · Computer Science 2021-09-30 Dimitrios Koutsoukos, Ingo Müller, Renato Marroquín, Ana Klimovic, Gustavo Alonso

Integrating Abstractions to Enhance the Execution of Distributed Applications

One of the factors that limits the scale, performance, and sophistication of distributed applications is the difficulty of concurrently executing them on multiple distributed computing resources. In part, this is due to a poor understanding…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-02-22 Matteo Turilli, Feng Liu, Zhao Zhang, Andre Merzky, Michael Wilde, Jon Weissman, Daniel S. Katz, Shantenu Jha