Related papers: Benchmarking the Graphulo Processing Framework
Big data and the Internet of Things era continue to challenge computational systems. Several technology solutions such as NoSQL databases have been developed to deal with this challenge. In order to generate meaningful results from large…
The Apache Accumulo database excels at distributed storage and indexing and is ideally suited for storing graph data. Many big data analytics compute on graph data and persist their results back to the database. These graph calculations are…
Google BigTable's scale-out design for distributed key-value storage inspired a generation of NoSQL databases. Recently the NewSQL paradigm emerged in response to analytic workloads that demand distributed computation local to data storage.…
Triangle counting is a key algorithm for large graph analysis. The Graphulo library provides a framework for implementing graph algorithms on the Apache Accumulo distributed database. In this work we adapt two algorithms for counting…
Complex networks are relational data sets commonly represented as graphs. The analysis of their intricate structure is relevant to many areas of science and commerce, and data sets may reach sizes that require distributed storage and…
Streaming, big data applications face challenges in creating scalable data flow pipelines, in which multiple data streams must be collected, stored, queried, and analyzed. These data sources are characterized by their volume (in terms of…
We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy. A set of representatives provides an intuitive description of each cluster, supports the…
The analysis of graphs has become increasingly important to a wide range of applications. Graph analysis presents a number of unique challenges in the areas of (1) software complexity, (2) data complexity, (3) security, (4) mathematical…
The GraphBLAS standard (GraphBlas.org) is being developed to bring the potential of matrix-based graph algorithms to the broadest possible audience. Mathematically, the GraphBLAS defines a core set of matrix-based graph operations that can…
We present a graph processing benchmark suite with the goal of helping to standardize graph processing evaluations. Fewer differences between graph processing evaluations will make it easier to compare different research efforts and…
Graph algorithms and techniques are increasingly being used in scientific and commercial applications to express relations and explore large data sets. Although conventional or commodity computer architectures, such as CPUs or GPUs, can compute…
Graph clustering has many important applications in computing, but due to growing sizes of graphs, even traditionally fast clustering methods such as spectral partitioning can be computationally expensive for real-world graphs of interest.…
The Apache Accumulo database is an open source relaxed consistency database that is widely used for government applications. Accumulo is designed to deliver high performance on unstructured data such as graphs of network data. This paper…
Inspired by the success of Google's Pregel, many systems have been developed recently for iterative computation over big graphs. These systems provide a user-friendly vertex-centric programming interface, where a programmer only needs to…
Deep recommender systems rely heavily on large embedding tables to handle high-cardinality categorical features such as user/item identifiers, and face significant memory constraints at scale. To tackle this challenge, hashing techniques…
GraphQL is a query language for APIs and a runtime to execute queries. Using GraphQL queries, clients define precisely what data they wish to retrieve or mutate on a server, leading to fewer round trips and reduced response sizes. Although…
In recent years, parallel processing using multi-core processors, local clusters, GPUs, and other platforms has become increasingly popular. Moreover, one of the main requirements of IT users is currently the reduction of maintaining…
Graph processing systems are essential for analyzing large-scale data with complex relationships, yet most existing frameworks rely on statically provisioned clusters, resulting in poor elasticity and inefficient resource utilization under…
Processing large complex networks recently attracted considerable interest. Complex graphs are useful in a wide range of applications from technological networks to biological systems like the human brain. Sometimes these networks are…
High-performance implementations of graph algorithms are challenging to implement on new parallel hardware such as GPUs because of three challenges: (1) the difficulty of coming up with graph building blocks, (2) load imbalance on parallel…