Related papers: Partition Reduction for Lossy Data Compression Pro…
The amount of data generated and gathered in scientific simulations and data collection applications is continuously growing, putting mounting pressure on storage and bandwidth concerns. A means of reducing such issues is data compression;…
This paper presents error-bounded lossy compression tailored for particle datasets from diverse scientific applications in cosmology, fluid dynamics, and fusion energy sciences. As today's high-performance computing capabilities advance,…
Rapidly increasing data sizes in scientific computing are the driving force behind the need for lossy compression. The main drawback of lossy data compression is the introduction of error. This paper explains why many error-bounded…
Entropy and free-energy estimation are key in thermodynamic characterization of simulated systems ranging from spin models through polymers, colloids, protein structure, and drug-design. Current techniques suffer from being model specific,…
Today's HPC applications are producing extremely large amounts of data, such that data storage and analysis are becoming more challenging for scientific research. In this work, we design a new error-controlled lossy compression algorithm…
An alternative approach to two-part 'critical compression' is presented. Whereas previous results were based on summing a lossless code at reduced precision with a lossy-compressed error or noise term, the present approach uses a similar…
Lossy compressors are increasingly adopted in scientific research, tackling volumes of data from experiments or parallel numerical simulations and facilitating data storage and movement. In contrast with the notion of entropy in lossless…
Can we use machine learning to compress graph data? The absence of ordering in graphs poses a significant challenge to conventional compression algorithms, limiting their attainable gains as well as their ability to discover relevant…
Machine learning has had a major impact on data compression over the last decade and inspired many new, exciting theoretical and applied questions. This paper describes one such direction -- relative entropy coding -- which focuses on…
Our increasingly digital and connected world has led to the generation of unprecedented amounts of data. This data must be efficiently managed, transmitted, and stored to preserve resources and allow scalability. Data compression has…
Today's scientific simulations, for example in the high-performance exascale sector, produce huge amounts of data. Due to limited I/O bandwidth and available storage space, there is the necessity to reduce scientific data of high…
Comparison-based algorithms are algorithms for which the execution of each operation is solely based on the outcome of a series of comparisons between elements. Comparison-based computations can be naturally represented via the following…
In this paper we investigate the problem of partitioning an input string T in such a way that compressing individually its parts via a base-compressor C gets a compressed output that is shorter than applying C over the entire T at once.…
Many scientific applications opt for particles instead of meshes as their basic primitives to model complex systems composed of billions of discrete entities. Such applications span a diverse array of scientific domains, including molecular…
We formulate the problem of performing optimal data compression under the constraints that compressed data can be used for accurate classification in machine learning. We show that this translates to a problem of minimizing the mutual…
We consider a natural generalization of the Partial Vertex Cover problem. Here an instance consists of a graph G = (V,E), a positive cost function c: V-> Z^{+}, a partition $P_1,..., P_r$ of the edge set $E$, and a parameter $k_i$ for each…
Increasing data volumes from scientific simulations and instruments (supercomputers, accelerators, telescopes) often exceed network, storage, and analysis capabilities. The scientific community's response to this challenge is scientific…
We present herein a scheme by which to accurately evaluate the error exponents of a lossy data compression problem, which characterize average probabilities over a code ensemble of compression failure and success above or below a critical…
Compression of floating-point data will play an important role in high-performance computing as data bandwidth and storage become dominant costs. Lossy compression of floating-point data is powerful, but theoretical results are needed to…
The unfolding problem formulation for correcting experimental data distortions due to finite resolution and limited detector acceptance is discussed. A novel validation of the problem solution is proposed. Attention is drawn to fact that…