Related papers: Optimal Compression for Two-Field Entries in Fixed…
Constrained codes are used to prevent errors from occurring in various data storage and data transmission systems. They can help in increasing the storage density of magnetic storage devices, in managing the lifetime of electronic storage…
Today, with the growing demands of information storage and data transfer, data compression is becoming increasingly important. Data compression is a technique used to decrease the size of data. This is very useful when some huge…
Data compression has been widely applied in many data processing areas. Compression methods use variable-size codes with the shorter codes assigned to symbols or groups of symbols that appear in the data frequently. Fibonacci coding, as a…
Recent developments in storage, especially in the area of resistive random access memory (ReRAM), are attempting to scale the storage density by regarding the information data as two-dimensional (2D), instead of one-dimensional (1D).…
Compressed indexing is a powerful technique that enables efficient querying over data stored in compressed form, significantly reducing memory usage and often accelerating computation. While extensive progress has been made for…
Compressing integer keys is a fundamental operation among multiple communities, such as database management (DB), information retrieval (IR), and high-performance computing (HPC). Recent advances in learned indexes have inspired the…
This paper studies two crucial problems in the context of coded distributed storage systems directly related to their performance: 1) for a fixed alphabet size, determine the minimum number of servers the system must have for its service…
Optimizing distributed learning systems is an art of balancing between computation and communication. There have been two lines of research that try to deal with slower networks: communication compression for low bandwidth networks,…
Compared to a large spectrum of performance optimizations, relatively little effort has been dedicated to optimizing other aspects of embedded applications such as memory space requirements, power, real-time predictability, and…
We examine the problem of allocating a given total storage budget in a distributed storage system for maximum reliability. A source has a single data object that is to be coded and stored over a set of storage nodes; it is allowed to store…
This paper considers the problem of distributed source coding for a large network. A major obstacle that poses an existential threat to practical deployment of conventional approaches to distributed coding is the exponential growth of the…
Data compression is an efficient technique to save data storage and transmission costs. However, traditional data compression methods always ignore the impact of user preferences on the statistical distributions of symbols transmitted over…
In-memory columnar databases have become mainstream over the last decade and have vastly improved the fast processing of large volumes of data through multi-core parallelism and in-memory compression, thereby eliminating the usual…
We study the problem of efficient compression of a stochastic source of probability distributions. It can be viewed as a generalization of Shannon's source coding problem. It has relation to the theory of common randomness, as well as to…
We consider the problem of optimally compressing and caching data across a communication network. Given the data generated at edge nodes and a routing path, our goal is to determine the optimal data compression ratios and caching decisions…
We introduce a new family of compressed data structures to efficiently store and query large string dictionaries in main memory. Our main technique is a combination of hierarchical Front-coding with ideas from longest-common-prefix…
Compression algorithms reduce the redundancy in data representation to decrease the storage required for that data. Data compression offers an attractive approach to reducing communication costs by using available bandwidth effectively.…
Constrained coding plays a key role in optimizing performance and mitigating errors in applications such as storage and communication, where specific constraints on codewords are required. While non-parametric constraints have been…
We examine the problem of creating an encoded distributed storage representation of a data object for a network of mobile storage nodes so as to achieve the optimal recovery delay. A source node creates a single data object and disseminates…
Storing a word or a whole text segment requires a large amount of storage space. Typically, a character requires 1 byte of memory. Memory compression is very important for data management. In case of memory requirement…