Related papers: RStore: A Distributed Multi-version Document Store
Motivated by applications of distributed storage systems to cloud-based key-value stores, the multi-version coding problem has been recently formulated to efficiently store frequently updated data in asynchronous decentralized storage…
The relative ease of collaborative data science and analysis has led to a proliferation of many thousands or millions of versions of the same datasets in many scientific and commercial domains, acquired or constructed at various stages of…
Motivated by applications of distributed storage systems to key-value stores, the multi-version coding problem was formulated to efficiently store frequently updated data in asynchronous decentralized storage systems. Inspired by…
The ability to store multiple versions of a data item is a powerful primitive that has had a wide variety of uses: relational databases, transactional memory, version control systems, to name a few. However, each implementation uses a very…
Today's storage systems expose abstractions that are either so low-level (e.g., key-value store, raw-block store) that they require developers to reinvent the wheel, or so high-level (e.g., relational databases, Git) that they lack…
We present VStore, a data store for supporting fast, resource-efficient analytics over large archival videos. VStore manages video ingestion, storage, retrieval, and consumption. It controls video formats along the video data path. It is…
Dynamic graph storage systems are essential for real-time applications such as social networks and recommendation, where graph data continuously evolves. However, they face significant challenges in efficiently handling concurrent read and…
Data distribution across different facilities offers benefits such as enhanced resource utilization, increased resilience through replication, and improved performance by processing data near its source. However, managing such data is…
To accommodate the needs of large-scale distributed P2P systems, scalable data management strategies are required, allowing applications to efficiently cope with continuously growing, highly distributed data. This paper addresses the…
In applications of distributed storage systems to distributed computing and implementation of key-value stores, the following property, usually referred to as consistency in computer science and engineering, is an important requirement: as…
Distributed Hash Tables offer a resilient lookup service for unstable distributed environments. Resilient data storage, however, requires additional data replication and maintenance algorithms. These algorithms can have an impact on both…
In applications of distributed storage systems to modern key-value stores, the stored data is highly dynamic due to frequent updates. The multi-version coding problem was formulated to study the cost of storing dynamic data in distributed…
As applications continue to generate multi-dimensional data at exponentially increasing rates, fast analytics to extract meaningful results is becoming extremely important. The database community has developed array databases that alleviate…
Fault-tolerant distributed applications require mechanisms to recover data lost via a process failure. On modern cluster systems it is typically impractical to request replacement resources after such a failure. Therefore, applications have…
Many repositories utilize the versatile RDF model to publish data. Repositories are typically distributed and geographically remote, but data are interconnected (e.g., the Semantic Web) and queried globally by a language such as SPARQL. Due…
We consider the problem of Robust Dynamic Coded Distributed Storage (RDCDS) with partially storage constrained servers where the goal is to enable robust (resilient to server dropouts) and efficient (as measured by the communication costs)…
A distributed storage system (DSS) needs to be efficiently accessible and repairable. Recently, considerable effort has been made towards the latter, while the former is usually not considered, since a trivial solution exists in the form of…
This study proposes a novel storage engine, SynchroStore, designed to address the inefficiency of update operations in columnar storage systems based on Log-Structured Merge Trees (LSM-Trees) under hybrid workload scenarios. While columnar…
Distributed storage systems with replication are well known for storing large amounts of data. Many replicas are kept in order to provide reliability. This makes the system expensive. Various methods have been proposed over…
The continuously increasing amount of digital data generated by today's society asks for better storage solutions. This survey looks at a new generation of coding techniques designed specifically for the needs of distributed networked…