Related papers: A Universal Parallel Two-Pass MDL Context Tree Com…

A Parallel Two-Pass MDL Context Tree Algorithm for Universal Source Coding

We present a novel lossless universal source coding algorithm that uses parallel computational units to increase the throughput. The length-$N$ input sequence is partitioned into $B$ blocks. Processing each block independently of the other…

Information Theory · Computer Science 2016-03-27 Nikhil Krishnan, Dror Baron, Mehmet Kıvanç Mıhçak

Bilateral Distribution Compression: Reducing Both Data Size and Dimensionality

Existing distribution compression methods reduce the number of observations in a dataset by minimising the Maximum Mean Discrepancy (MMD) between original and compressed sets, but modern datasets are often large in both sample size and…

Machine Learning · Statistics 2026-01-28 Dominic Broadbent, Nick Whiteley, Robert Allison, Tom Lovett

A New Lossless Data Compression Algorithm Exploiting Positional Redundancy

A new run length encoding algorithm for lossless data compression that exploits positional redundancy by representing data in a two-dimensional model of concentric circles is presented. This visual transform enables detection of runs (each…

Data Structures and Algorithms · Computer Science 2021-07-30 Pranav Venkatram

Hypersuccinct Trees -- New universal tree source codes for optimal compressed tree data structures and range minima

We present a new universal source code for distributions of unlabeled binary and ordinal trees that achieves optimal compression to within lower order terms for all tree sources covered by existing universal codes. At the same time, it…

Data Structures and Algorithms · Computer Science 2021-09-06 J. Ian Munro, Patrick K. Nicholson, Louisa Seelbach Benkner, Sebastian Wild

Results on the Fundamental Gain of Memory-Assisted Universal Source Coding

Many applications require data processing to be performed on individual pieces of data which are of finite sizes, e.g., files in cloud storage units and packets in data networks. However, traditional universal compression solutions would…

Information Theory · Computer Science 2012-05-22 Ahmad Beirami, Mohsen Sardari, Faramarz Fekri

Practical Parallel Block Tree Construction: First Results

The block tree [Belazzougui et al., J. Comput. Syst. Sci. '21] is a compressed representation of a length-$n$ text that supports access, rank, and select queries while requiring only $O(z\log\frac{n}{z})$ words of space, where $z$ is the…

Data Structures and Algorithms · Computer Science 2025-12-30 Robert Clausecker, Florian Kurpicz, Etienne Palanga

Parallel Recursive State Compression for Free

This paper focuses on reducing memory usage in enumerative model checking, while maintaining the multi-core scalability obtained in earlier work. We present a tree-based multi-core compression method, which works by leveraging sharing among…

Data Structures and Algorithms · Computer Science 2011-05-17 Alfons Laarman, Jaco van de Pol, Michael Weber

Parallel Batch-Dynamic $k$d-Trees

$k$d-trees are widely used in parallel databases to support efficient neighborhood/similarity queries. Supporting parallel updates to $k$d-trees is therefore an important operation. In this paper, we present BDL-tree, a parallel,…

Data Structures and Algorithms · Computer Science 2021-12-14 Rahul Yesantharao, Yiqiu Wang, Laxman Dhulipala, Julian Shun

Study On Universal Lossless Data Compression by using Context Dependence Multilevel Pattern Matching Grammar Transform

In this paper, the context dependence multilevel pattern matching (CDMPM for short) grammar transform is proposed; based on this grammar transform, a universal lossless data compression algorithm, the CDMPM code, is then developed. Moreover we…

Discrete Mathematics · Computer Science 2013-03-21 Chung-Song Kim, Chol-Hun Kim

Lossless Prompt Compression via Dictionary-Encoding and In-Context Learning: Enabling Cost-Effective LLM Analysis of Repetitive Data

In-context learning has established itself as an important learning paradigm for Large Language Models (LLMs). In this paper, we demonstrate that LLMs can learn encoding keys in-context and perform analysis directly on encoded…

Computation and Language · Computer Science 2026-04-16 Andresa Rodrigues de Campos, David Lee, Imry Kissos, Piyush Paritosh

Massively-Parallel Lossless Data Decompression

Today's exponentially increasing data volumes and the high cost of storage make compression essential for the Big Data industry. Although research has concentrated on efficient compression, fast decompression is critical for analytics…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-06-03 Evangelia Sitaridi, Rene Mueller, Tim Kaldewey, Guy Lohman, Kenneth Ross

MST-compression: Compressing and Accelerating Binary Neural Networks with Minimum Spanning Tree

Binary neural networks (BNNs) have been widely adopted to reduce the computational cost and memory storage on edge-computing devices by using one-bit representation for activations and weights. However, as neural networks become…

Computer Vision and Pattern Recognition · Computer Science 2023-08-29 Quang Hieu Vo, Linh-Tam Tran, Sung-Ho Bae, Lok-Won Kim, Choong Seon Hong

A Distributed Parallel Algorithm for Minimum Spanning Tree Problem

In this paper we present and evaluate a parallel algorithm for solving a minimum spanning tree (MST) problem for supercomputers with distributed memory. The algorithm relies on the relaxation of the message processing order requirement for…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-18 Artem Mazeev, Alexander Semenov, Alexey Simonov

Optimizing run-length algorithm using octonary repetition tree

Compression is beneficial because it helps reduce resource usage. It reduces data storage space as well as transmission traffic and speeds up web page loading. Run-length coding (RLC) is a lossless data compression algorithm. Data are…

Data Structures and Algorithms · Computer Science 2016-11-30 Kaveh Geyratmand Haghighi, Mirkamal Mirnia, Ahmad Habibizad Navin

On Finite Memory Universal Data Compression and Classification of Individual Sequences

Consider the case where consecutive blocks of N letters of a semi-infinite individual sequence X over a finite-alphabet are being compressed into binary sequences by some one-to-one mapping. No a-priori information about X is available at…

Information Theory · Computer Science 2013-01-25 Jacob Ziv

Memory-Assisted Universal Compression of Network Flows

Recently, the existence of a considerable amount of redundancy in Internet traffic has stimulated the deployment of several redundancy elimination techniques within the network. These techniques are often based on either packet-level…

Information Theory · Computer Science 2012-04-02 Mohsen Sardari, Ahmad Beirami, Faramarz Fekri

Memory-Assisted Universal Source Coding

The problem of the universal compression of a sequence from a library of several small to moderate length sequences from similar context arises in many practical scenarios, such as the compression of the storage data and the Internet…

Information Theory · Computer Science 2015-03-20 Ahmad Beirami, Faramarz Fekri

Network Compression: Memory-Assisted Universal Coding of Sources with Correlated Parameters

In this paper, we propose distributed network compression via memory. We consider two spatially separated sources with correlated unknown source parameters. We wish to study the universal compression of a sequence of length $n$ from…

Information Theory · Computer Science 2012-10-09 Ahmad Beirami, Faramarz Fekri

BS-tree: A gapped data-parallel B-tree

We propose BS-tree, an in-memory implementation of the B+-tree that adopts the structure of the disk-based index (i.e., a balanced, multiway tree), setting the node size to a memory block that can be processed fast and in parallel using…

Databases · Computer Science 2025-11-14 Dimitrios Tsitsigkos, Achilleas Michalopoulos, Nikos Mamoulis, Manolis Terrovitis

Efficient parallel CP decomposition with pairwise perturbation and multi-sweep dimension tree

CP tensor decomposition with alternating least squares (ALS) is dominated in cost by the matricized-tensor times Khatri-Rao product (MTTKRP) kernel that is necessary to set up the quadratic optimization subproblems. State-of-the-art parallel…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-26 Linjian Ma, Edgar Solomonik