Related papers: Mapping Datasets to Object Storage System

Towards an Arrow-native Storage System

With the ever-increasing dataset sizes, several file formats like Parquet, ORC, and Avro have been developed to store data efficiently and to save network and interconnect bandwidth at the price of additional CPU utilization. However, with…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-24 Jayjeet Chakraborty, Ivo Jimenez, Sebastiaan Alvarez Rodriguez, Alexandru Uta, Jeff LeFevre, Carlos Maltzahn

Skyhook: Towards an Arrow-Native Storage System

With the ever-increasing dataset sizes, several file formats such as Parquet, ORC, and Avro have been developed to store data efficiently and to save network and interconnect bandwidth at the price of additional CPU utilization. However,…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-04-14 Jayjeet Chakraborty, Ivo Jimenez, Sebastiaan Alvarez Rodriguez, Alexandru Uta, Jeff LeFevre, Carlos Maltzahn

DAOS for Extreme-scale Systems in Scientific Applications

Exascale I/O initiatives will require new and fully integrated I/O models which are capable of providing straightforward functionality, fault tolerance and efficiency. One solution is the Distributed Asynchronous Object Storage (DAOS)…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-12-04 M. Scot Breitenfeld, Neil Fortner, Jordan Henderson, Jerome Soumagne, Mohamad Chaarawi, Johann Lombardi, Quincey Koziol

Using Object-Relational Mapping to Create the Distributed Databases in a Hybrid Cloud Infrastructure

One of the current challenges in the use of cloud services is the design of specialized data management systems. This is especially important for hybrid systems in which the data are located in public and private…

Databases · Computer Science 2015-01-06 Oleg Lukyanchikov, Evgeniy Pluzhnik, Simon Payain, Evgeny Nikulchev

Enabling Object Storage via shims for Grid Middleware

The Object Store model has quickly become the basis of most commercially successful mass storage infrastructure, backing so-called "Cloud" storage such as Amazon S3, but also underlying the implementation of most parallel distributed…

Computational Physics · Physics 2016-01-20 Samuel Cadellin Skipsey, Shaun De Witt, Alastair Dewhurst, David Britton, Gareth Roy, David Crooks

Distributed and Big Data Storage Management in Grid Computing

Big data storage management is one of the most challenging issues for Grid computing environments, since a large number of data-intensive applications frequently involve a high degree of data access locality. Grid applications typically deal…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-07-13 Ajay Kumar, Seema Bawa

Self-Evolving Distributed Memory Architecture for Scalable AI Systems

Distributed AI systems face critical memory management challenges across computation, communication, and deployment layers. RRAM-based in-memory computing suffers from scalability limitations due to device non-idealities and fixed array…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-19 Zixuan Li, Chuanzhen Wang, Haotian Sun

Scaling Shared-Memory Data Structures as Distributed Global-View Data Structures in the Partitioned Global Address Space model

The Partitioned Global Address Space (PGAS), a memory model in which the global address space is explicitly partitioned across compute nodes in a cluster, strives to bridge the gap between shared-memory and distributed-memory programming.…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-02 Garvit Dewan, Louis Jenkins

Distributed Ranges: A Model for Distributed Data Structures, Algorithms, and Views

Data structures and algorithms are essential building blocks for programs, and distributed data structures, which automatically partition data across multiple memory locales, are essential to writing high-level parallel programs.…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-06-06 Benjamin Brock, Robert Cohn, Suyash Bakshi, Tuomas Karna, Jeongnim Kim, Mateusz Nowak, Łukasz Ślusarczyk, Kacper Stefanski, Timothy G. Mattson

Navigating the Landscape of Distributed File Systems: Architectures, Implementations, and Considerations

Distributed File Systems (DFS) have emerged as sophisticated solutions for efficient file storage and management across interconnected computer nodes. The main objective of DFS is to achieve flexible, scalable, and resilient file storage…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-09 Xueting Pan, Ziqian Luo, Lisang Zhou

Deploying Customized Data Representation and Approximate Computing in Machine Learning Applications

Major advancements in building general-purpose and customized hardware have been one of the key enablers of versatility and pervasiveness of machine learning models such as deep neural networks. To sustain this ubiquitous deployment of…

Machine Learning · Computer Science 2018-06-05 Mahdi Nazemi, Massoud Pedram

Towards a decentralized algorithm for mapping network and computational resources for distributed data-flow computations

Several high-throughput distributed data-processing applications require multi-hop processing of streams of data. These applications include continual processing on data streams originating from a network of sensors, composing a multimedia…

Distributed, Parallel, and Cluster Computing · Computer Science 2009-03-26 Shah Asaduzzaman, Muthucumaru Maheswaran

DMAPF: A Decentralized and Distributed Solver for Multi-Agent Path Finding Problem with Obstacles

Multi-Agent Path Finding (MAPF) is the problem of finding a sequence of movements for agents to reach their assigned locations without collision. Centralized algorithms usually give optimal solutions, but have difficulty scaling without…

Multiagent Systems · Computer Science 2021-09-20 Poom Pianpak, Tran Cao Son

DeepMapping: Learned Data Mapping for Lossless Compression and Efficient Lookup

Storing tabular data to balance storage and query efficiency is a long-standing research question in the database community. In this work, we argue and show that a novel DeepMapping abstraction, which relies on the impressive memorization…

Databases · Computer Science 2024-09-27 Lixi Zhou, K. Selçuk Candan, Jia Zou

Distributed Management of Massive Data: an Efficient Fine-Grain Data Access Scheme

This paper addresses the problem of efficiently storing and accessing massive data blocks in a large-scale distributed environment, while providing efficient fine-grain access to data subsets. This issue is crucial in the context of…

Distributed, Parallel, and Cluster Computing · Computer Science 2008-10-14 Bogdan Nicolae, Gabriel Antoniu, Luc Bougé

Distributed Data Placement via Graph Partitioning

With the widespread use of shared-nothing clusters of servers, there has been a proliferation of distributed object stores that offer high availability, reliability and enhanced performance for MapReduce-style workloads. However, relational…

Databases · Computer Science 2013-12-03 Lukasz Golab, Marios Hadjieleftheriou, Howard Karloff, Barna Saha

Exploring DAOS Interfaces and Performance

Distributed Asynchronous Object Store (DAOS) is a novel software-defined object store leveraging Non-Volatile Memory (NVM) devices, designed for high performance. It provides a number of interfaces for applications to undertake I/O, ranging…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-30 Nicolau Manubens, Johann Lombardi, Simon D. Smart, Emanuele Danovaro, Tiago Quintino, Dean Hildebrand, Adrian Jackson

Towards Offloadable and Migratable Microservices on Disaggregated Architectures: Vision, Challenges, and Research Roadmap

Microservice and serverless computing systems open up massive versatility and opportunity to distributed and datacenter-scale computing. In the meantime, the deployments of modern datacenter resources are moving to disaggregated…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-04-26 Xiaoyi Lu, Arjun Kashyap

Self-healing Nodes with Adaptive Data-Sharding

Data sharding, a technique for partitioning and distributing data among multiple servers or nodes, offers enhancements in the scalability, performance, and fault tolerance of extensive distributed systems. Nonetheless, this strategy…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-02 Ayush Thakur, Sanskar Chauhan, Ilisha Tomar, Vaibhavi Paul, Deepak Gupta

Design and Experimental Evaluation of Algorithms for Optimizing the Throughput of Dispersed Computing

With growing deployment of Internet of Things (IoT) and machine learning (ML) applications, which need to leverage computation on edge and cloud resources, it is important to develop algorithms and tools to place these distributed…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-30 Xiangchen Zhao, Diyi Hu, Bhaskar Krishnamachari