
GrateTile: Efficient Sparse Tensor Tiling for CNN Processing

2020-09-21 v1 Hardware Architecture, Machine Learning

Abstract

We propose GrateTile, an efficient, hardware-friendly data storage scheme for sparse CNN feature maps (activations). It divides data into uneven-sized sub-tensors and, with small indexing overhead, stores them in a compressed yet randomly accessible format. This design enables modern CNN accelerators to fetch and decompress sub-tensors on the fly during tiled processing. GrateTile is suitable for architectures that favor aligned, coalesced data access, and it requires only minimal changes to the overall architectural design. We simulate GrateTile with state-of-the-art CNNs and show an average DRAM bandwidth reduction of 55% while using only 0.6% of the feature map size for indexing storage.
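To make the idea of "compressed yet randomly accessible" storage concrete, the following is a minimal sketch, not the GrateTile format itself. It assumes a 1-D row of activations, a hypothetical repeating block-size pattern `(6, 2)`, and a simple bitmask-plus-nonzero-values encoding; the names `split_uneven`, `build_store`, and `fetch_block` are illustrative and do not come from the paper. The point it illustrates is that a small per-block index (an offset into the packed payload) is enough to decompress any one block without scanning the others.

```python
from typing import List, Sequence, Tuple

Index = List[Tuple[int, List[int]]]  # per block: (payload offset, bitmask)


def split_uneven(row: Sequence[int], sizes: Sequence[int]) -> List[List[int]]:
    """Cut a row into consecutive blocks following a repeating size pattern.

    The pattern (e.g. (6, 2)) is a hypothetical choice for illustration.
    """
    blocks, pos, i = [], 0, 0
    while pos < len(row):
        size = sizes[i % len(sizes)]
        blocks.append(list(row[pos:pos + size]))
        pos += size
        i += 1
    return blocks


def build_store(row: Sequence[int], sizes: Sequence[int]) -> Tuple[Index, List[int]]:
    """Compress each block to (bitmask, nonzeros) and record where it starts."""
    index: Index = []
    payload: List[int] = []  # nonzero values of all blocks, packed back to back
    for block in split_uneven(row, sizes):
        mask = [1 if v != 0 else 0 for v in block]
        index.append((len(payload), mask))
        payload.extend(v for v in block if v != 0)
    return index, payload


def fetch_block(index: Index, payload: Sequence[int], block_id: int) -> List[int]:
    """Decompress a single block using only its index entry."""
    offset, mask = index[block_id]
    values = iter(payload[offset:offset + sum(mask)])
    return [next(values) if bit else 0 for bit in mask]


row = [0, 3, 0, 0, 0, 0, 7, 0, 1, 0, 0, 0, 0, 0, 0, 2]
index, payload = build_store(row, sizes=(6, 2))
third_block = fetch_block(index, payload, block_id=2)
```

In this toy setting the payload holds only the four nonzero values, and any block can be fetched directly from its offset. A real design would also account for memory alignment and for the bit-level cost of the masks and offsets, which this sketch ignores.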

Cite

@article{arxiv.2009.08685,
  title  = {GrateTile: Efficient Sparse Tensor Tiling for CNN Processing},
  author = {Yu-Sheng Lin and Hung Chang Lu and Yang-Bin Tsao and Yi-Min Chih and Wei-Chao Chen and Shao-Yi Chien},
  journal= {arXiv preprint arXiv:2009.08685},
  year   = {2020}
}

Comments

To be published at the IEEE Workshop on Signal Processing Systems (SiPS 2020)
