English
Related papers

Related papers: CODAG: Characterizing and Optimizing Decompression…

200 papers

Text analytics directly on compression (TADOC) has proven to be a promising technology for big data analytics. GPUs are extremely popular accelerators for data analytics systems. Unfortunately, no work so far shows how to utilize GPUs to…

Databases · Computer Science 2021-06-15 Feng Zhang , Zaifeng Pan , Yanliang Zhou , Jidong Zhai , Xipeng Shen , Onur Mutlu , Xiaoyong Du

GPUs are uniquely suited to accelerate (SQL) analytics workloads thanks to their massive compute parallelism and High Bandwidth Memory (HBM) -- when datasets fit in the GPU HBM, performance is unparalleled. Unfortunately, GPU HBMs remain…

Today's exponentially increasing data volumes and the high cost of storage make compression essential for the Big Data industry. Although research has concentrated on efficient compression, fast decompression is critical for analytics…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-06-03 Evangelia Sitaridi , Rene Mueller , Tim Kaldewey , Guy Lohman , Kenneth Ross

In GPU-accelerated data analytics, the overhead of data transfer from CPU to GPU becomes a performance bottleneck when the data scales beyond GPU memory capacity due to the limited PCIe bandwidth. Data compression has come to rescue for…

Databases · Computer Science 2026-02-10 Gwangoo Yeo , Zhiyang Shen , Wei Cui , Matteo Interlandi , Rathijit Sen , Bailu Ding , Qi Chen , Minsoo Rhu

GPUs rely on large register files to unlock thread-level parallelism for high throughput. Unfortunately, large register files are power hungry, making it important to seek for new approaches to improve their utilization. This paper…

Hardware Architecture · Computer Science 2020-12-10 Alexandra Angerd , Erik Sintorn , Per Stenström

In the last decades, the computational power of GPUs has grown exponentially, allowing current deep learning (DL) applications to handle increasingly large amounts of data at a progressively higher throughput. However, network and storage…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-08 Francesco Versaci , Giovanni Busonera

Huffman compression is a statistical, lossless, data compression algorithm that compresses data by assigning variable length codes to symbols, with the more frequently appearing symbols given shorter codes than the less. This work is a…

Information Theory · Computer Science 2011-07-11 R. L. Cloud , M. L. Curry , H. L. Ward , A. Skjellum , P. Bangalore

The JPEG compression format has been the standard for lossy image compression for over multiple decades, offering high compression rates at minor perceptual loss in image quality. For GPU-accelerated computer vision and deep learning tasks,…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-18 André Weißenberger , Bertil Schmidt

This paper provides an in-depth characterization of GPU-accelerated systems, to understand the interplay between overlapping computation and communication which is commonly employed in distributed training settings. Due to the large size of…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-08 Seonho Lee , Jihwan Oh , Junkyum Kim , Seokjin Go , Jongse Park , Divya Mahajan

As compared to a large spectrum of performance optimizations, relatively little effort has been dedicated to optimize other aspects of embedded applications such as memory space requirements, power, real-time predictability, and…

Other Computer Science · Computer Science 2011-11-09 O. Ozturk , H. Saputra , M. Kandemir , I. Kolcu

Motivated by the imperative for real-time responsiveness and data privacy preservation, large language models (LLMs) are increasingly deployed on resource-constrained edge devices to enable localized inference. To improve output quality,…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-11 Guihang Hong , Tao Ouyang , Kongyange Zhao , Zhi Zhou , Xu Chen

In the rapidly advancing field of image generation, Visual Auto-Regressive (VAR) modeling has garnered considerable attention for its innovative next-scale prediction approach. This paradigm offers substantial improvements in efficiency,…

Computer Vision and Pattern Recognition · Computer Science 2024-11-28 Zigeng Chen , Xinyin Ma , Gongfan Fang , Xinchao Wang

Graph neural networks (GNNs) have extended the success of deep neural networks (DNNs) to non-Euclidean graph data, achieving ground-breaking performance on various tasks such as node classification and graph property prediction.…

Machine Learning · Computer Science 2021-12-17 Tianfeng Liu , Yangrui Chen , Dan Li , Chuan Wu , Yibo Zhu , Jun He , Yanghua Peng , Hongzheng Chen , Hongzhi Chen , Chuanxiong Guo

GPUs offer orders-of-magnitude higher memory bandwidth than traditional CPU-only systems. However, GPU device memory tends to be relatively small and the memory capacity can not be increased by the user. This paper describes Buddy…

Hardware Architecture · Computer Science 2019-04-17 Esha Choukse , Michael Sullivan , Mike O'Connor , Mattan Erez , Jeff Pool , David Nellans , Steve Keckler

Scientific applications produce vast amounts of data, posing grand challenges in the underlying data management and analytic tasks. Progressive compression is a promising way to address this problem, as it allows for on-demand data…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-02 Yanliang Li , Wenbo Li , Qian Gong , Qing Liu , Norbert Podhorszki , Scott Klasky , Xin Liang , Jieyang Chen

Dynamic parallelism on GPUs allows GPU threads to dynamically launch other GPU threads. It is useful in applications with nested parallelism, particularly where the amount of nested parallelism is irregular and cannot be predicted…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-01-11 Mhd Ghaith Olabi , Juan Gómez Luna , Onur Mutlu , Wen-mei Hwu , Izzat El Hajj

We introduce RAGE, an image compression framework that achieves four generally conflicting objectives: 1) good compression for a wide variety of color images, 2) computationally efficient, fast decompression, 3) fast random access of images…

Image and Video Processing · Electrical Eng. & Systems 2024-02-12 Christian D. Rask , Daniel E. Lucani

GPUs have been widely used to accelerate computations exhibiting simple patterns of parallelism - such as flat or two-level parallelism - and a degree of parallelism that can be statically determined based on the size of the input dataset.…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-11-18 Hancheng Wu , Da Li , Michela Becchi

Graphics Processing Units allow for running massively parallel applications offloading the CPU from computationally intensive resources, however GPUs have a limited amount of memory. In this paper a trie compression algorithm for massively…

Data Structures and Algorithms · Computer Science 2017-02-20 Xavier Bellekens , Amar Seeam , Christos Tachtatzis , Robert Atkinson

Deep Learning system architects strive to design a balanced system where the computational accelerator -- FPGA, GPU, etc, is not starved for data. Feeding training data fast enough to effectively keep the accelerator utilization high is…

Performance · Computer Science 2018-12-04 Christian Pinto , Yiannis Gkoufas , Andrea Reale , Seetharami Seelam , Steven Eliuk
‹ Prev 1 2 3 10 Next ›