We propose GrateTile, an efficient, hardware-friendly data storage scheme for sparse CNN feature maps (activations). It divides data into uneven-sized sub-tensors and, with small indexing overhead, stores them in a compressed yet randomly accessible format. This design enables modern CNN accelerators to fetch and decompress sub-tensors on the fly in a tiled processing manner. GrateTile is suitable for architectures that favor aligned, coalesced data access, and requires only minimal changes to the overall architectural design. We simulate GrateTile with state-of-the-art CNNs and show an average DRAM bandwidth reduction of 55% while using only 0.6% of the feature map size for indexing storage.
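The general idea of compressed yet randomly accessible sub-tensors can be illustrated with a minimal sketch. It is not the GrateTile format itself: it assumes a 1-D array, a simple zero-bitmask compression, and hypothetical names (`store`, `fetch`, `cuts`) chosen only for illustration. The point it shows is that a small offset table lets any one block be fetched and decompressed without touching the others.

```python
def compress_block(block):
    """Zero-value compression: a presence bitmask plus the nonzero values."""
    mask = [1 if v != 0 else 0 for v in block]
    values = [v for v in block if v != 0]
    return mask, values


def decompress_block(mask, values):
    """Rebuild the dense block by placing values where the mask bit is set."""
    it = iter(values)
    return [next(it) if bit else 0 for bit in mask]


def store(data, cuts):
    """Split `data` at the (possibly uneven) cut points and pack the blocks.

    Returns per-block masks, one flat buffer of nonzero values, and an
    index of start offsets into that buffer (the indexing overhead).
    """
    bounds = [0, *cuts, len(data)]
    masks, buffer, index = [], [], [0]
    for lo, hi in zip(bounds, bounds[1:]):
        mask, values = compress_block(data[lo:hi])
        masks.append(mask)
        buffer.extend(values)
        index.append(len(buffer))
    return masks, buffer, index


def fetch(masks, buffer, index, i):
    """Random access: read only block i's slice of the buffer and expand it."""
    return decompress_block(masks[i], buffer[index[i]:index[i + 1]])
```

For example, calling `store(data, [2, 8])` on a 12-element array yields three blocks of sizes 2, 6, and 4, and `fetch(masks, buffer, index, 1)` recovers the middle block alone.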
Cite
@article{arxiv.2009.08685,
title = {GrateTile: Efficient Sparse Tensor Tiling for CNN Processing},
author = {Yu-Sheng Lin and Hung Chang Lu and Yang-Bin Tsao and Yi-Min Chih and Wei-Chao Chen and Shao-Yi Chien},
journal = {arXiv preprint arXiv:2009.08685},
year = {2020}
}
Comments
To be published at the IEEE Workshop on Signal Processing Systems (SiPS 2020)