
Related papers: A Parallel Data Compression Framework for Large Sc…

200 papers

Increasing data volumes from scientific simulations and instruments (supercomputers, accelerators, telescopes) often exceed network, storage, and analysis capabilities. The scientific community's response to this challenge is scientific…

Modern scientific simulations and instruments generate data volumes that overwhelm memory and storage, throttling scalability. Lossy compression mitigates this by trading controlled error for reduced footprint and throughput gains, yet…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-26 Skyler Ruiter , Jiannan Tian , Fengguang Song

Data compression is becoming critical for storing scientific data because many scientific applications need to store large amounts of data and post process this data for scientific discovery. Unlike image and video compression algorithms…

Machine Learning · Computer Science 2022-12-22 Tania Banerjee , Jong Choi , Jaemoon Lee , Qian Gong , Jieyang Chen , Scott Klasky , Anand Rangarajan , Sanjay Ranka

Particle-based simulations and point-cloud applications generate massive, irregular datasets that challenge storage, I/O, and real-time analytics. Traditional compression techniques struggle with irregular particle distributions and GPU…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-08-15 Ruoyu Li , Yafan Huang , Longtao Zhang , Zhuoxun Yang , Sheng Di , Jiajun Huang , Jinyang Liu , Jiannan Tian , Xin Liang , Guanpeng Li , Hanqi Guo , Franck Cappello , Kai Zhao

Today's scientific high performance computing (HPC) applications or advanced instruments are producing vast volumes of data across a wide range of domains, which introduces a serious burden on data transfer and storage. Error-bounded lossy…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-02-01 Xiaodong Yu , Sheng Di , Kai Zhao , Jiannan Tian , Dingwen Tao , Xin Liang , Franck Cappello

Many scientific data sets contain temporal dimensions. These are the data storing information at the same spatial location but different time stamps. Some of the biggest temporal datasets are produced by parallel computing applications such…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-03-13 Zheng Yuan , William Hendrix , Seung Woo Son , Christoph Federrath , Ankit Agrawal , Wei-keng Liao , Alok Choudhary

Modern HPC applications produce increasingly large amounts of data, which limits the performance of current extreme-scale systems. Data reduction techniques, such as lossy compression, help to mitigate this issue by decreasing the size of…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-01-13 Griffin Dube , Jiannan Tian , Sheng Di , Dingwen Tao , Jon Calhoun , Franck Cappello

Large-scale simulations of time-dependent problems generate a massive amount of data and with the explosive increase in computational resources the size of the data generated by these simulations has increased significantly. This has…

Computational Engineering, Finance, and Science · Computer Science 2022-01-19 Shaghayegh Zamani Ashtiani , Mujeeb R. Malik , Hessam Babaee

With endless amounts of data and very limited bandwidth, fast data compression is one solution for the growing data-sharing problem. Compression helps lower transfer times and save memory, but if the compression takes too long, this no…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-06-21 David Noel , Elizabeth Graham , Liyuan Liu

Modern scientific simulations, observations, and large-scale experiments generate data at volumes that often exceed the limits of storage, processing, and analysis. This challenge drives the development of data reduction methods that…

Machine Learning · Computer Science 2025-11-18 Minh Vu , Andrey Lokhov

With the ever-increasing execution scale of high performance computing (HPC) applications, vast amounts of data are being produced by scientific research every day. Error-bounded lossy compression has been considered a very promising…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-10-24 Jinyang Liu , Sheng Di , Kai Zhao , Xin Liang , Zizhong Chen , Franck Cappello

With ever-increasing volumes of scientific floating-point data being produced by high-performance computing applications, significantly reducing scientific floating-point data size is critical, and error-controlled lossy compressors have…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-20 Robert Underwood , Sheng Di , Jon C. Calhoun , Franck Cappello

Because of the vast volume of data being produced by today's scientific simulations, lossy compression allowing user-controlled information loss can significantly reduce the data size and the I/O burden. However, for large-scale cosmology…

Information Theory · Computer Science 2017-08-08 Dingwen Tao , Sheng Di , Zizhong Chen , Franck Cappello

In general, large datasets enable deep learning models to perform with good accuracy and generalizability. However, massive high-fidelity simulation datasets (from molecular chemistry, astrophysics, computational fluid dynamics (CFD), etc.…

Machine Learning · Computer Science 2022-07-27 Wai Tong Chung , Ki Sung Jung , Jacqueline H. Chen , Matthias Ihme

Nowadays simulations can produce petabytes of data to be stored in parallel filesystems or large-scale databases. This data is accessed over the course of decades often by thousands of analysts and scientists. However, storing these volumes…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-02-11 Salvatore Di Girolamo , Pirmin Schmid , Thomas Schulthess , Torsten Hoefler

Error-bounded lossy compression is one of the most efficient solutions to reduce the volume of scientific data. For lossy compression, progressive decompression and random-access decompression are critical features that enable on-demand…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-03 Daoce Wang , Pascal Grosset , Jesus Pulido , Jiannan Tian , Tushar M. Athawale , Jinda Jia , Baixi Sun , Boyuan Zhang , Sian Jin , Kai Zhao , James Ahrens , Fengguang Song

Present day computational fluid dynamics simulations generate extremely large amounts of data, sometimes on the order of TB/s. Often, a significant fraction of this data is discarded because current storage systems are unable to keep pace.…

Computational Engineering, Finance, and Science · Computer Science 2021-03-03 Heather Pacella , Alec Dunton , Alireza Doostan , Gianluca Iaccarino

Today's exponentially increasing data volumes and the high cost of storage make compression essential for the Big Data industry. Although research has concentrated on efficient compression, fast decompression is critical for analytics…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-06-03 Evangelia Sitaridi , Rene Mueller , Tim Kaldewey , Guy Lohman , Kenneth Ross

Today's large-scale scientific applications running on high-performance computing (HPC) systems generate vast data volumes. Thus, data compression is becoming a critical technique to mitigate the storage burden and data-movement cost.…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-04 Boyuan Zhang , Jiannan Tian , Sheng Di , Xiaodong Yu , Yunhe Feng , Xin Liang , Dingwen Tao , Franck Cappello

Scientific discoveries are increasingly constrained by limited storage space and I/O capacities. For time-series simulations and experiments, their data often need to be decimated over timesteps to accommodate storage and I/O limitations.…
