Balanced Sparsity for Efficient DNN Inference on GPU

Authors: Zhuliang Yao, Shijie Cao, Wencong Xiao, Chen Zhang, Lanshun Nie (pp. 5676-5683)

AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiment results show that Balanced Sparsity achieves up to 3.1x practical speedup for model inference on GPU, while retaining the same high model accuracy as fine-grained sparsity.
Researcher Affiliation | Collaboration | Zhuliang Yao (1,4), Shijie Cao (2,4), Wencong Xiao (3,4), Chen Zhang (4), Lanshun Nie (2); 1: Tsinghua University, 2: Harbin Institute of Technology, 3: Beihang University, 4: Microsoft Research Asia; {v-zhuyao, v-shicao, v-wencxi, zhac}@microsoft.com, nls@hit.edu.cn
Pseudocode | Yes | Algorithm 1: Balance-aware Iterative Pruning. Input: the matrix to be pruned, M; the number of blocks per row, BlockNum; the expected sparsity, Sparsity. Output: the pruned matrix, Mp.
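The row above only lists the inputs and output of Algorithm 1. Below is a minimal NumPy sketch of a single balance-aware pruning pass, not the authors' reference implementation: the function name balance_aware_prune and the one-shot per-block thresholding are illustrative assumptions (the paper's algorithm is iterative, gradually raising the sparsity target across pruning rounds).

    import numpy as np

    def balance_aware_prune(M, block_num, sparsity):
        # Split each row into `block_num` equal-sized blocks and zero out the
        # same fraction of smallest-magnitude weights inside every block, so
        # all blocks keep an identical number of non-zeros (the balanced property).
        Mp = M.copy()
        rows, cols = Mp.shape
        assert cols % block_num == 0, "row length must divide evenly into blocks"
        block_size = cols // block_num
        prune_per_block = int(round(block_size * sparsity))

        for r in range(rows):
            for b in range(block_num):
                block = Mp[r, b * block_size:(b + 1) * block_size]  # view into Mp
                smallest = np.argsort(np.abs(block))[:prune_per_block]
                block[smallest] = 0.0  # zeros written through the view
        return Mp

Because every block keeps exactly the same number of non-zeros, the non-zero weights of each row can be packed into equal-length segments, which is what makes the GPU kernel load-balanced.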
Open Source Code | Yes | Please refer to https://github.com/Howal/balanced-sparsity/blob/master/appendix-aaai19.pdf for proof.
Open Datasets | Yes | PTB dataset (Marcus et al. 1999), ImageNet ILSVRC-2012 dataset (Krizhevsky, Sutskever, and Hinton 2012), TIMIT dataset
Dataset Splits | Yes | VGG-16... dataset has 1.2M training examples and 50k validation examples.
Hardware Specification | No | The paper mentions experiments were run 'on GPU' and refers to 'GPU architecture' and 'GPU inference performance test', but does not specify any particular GPU model (e.g., NVIDIA A100, Tesla V100), CPU, or other hardware specifications.
Software Dependencies | No | The paper mentions using the 'cuBLAS library', 'cuSPARSE library', and an 'open sourced GPU library (Gray, Radford, and Kingma 2017)', but does not specify version numbers for these software components or any other software dependencies.
Experiment Setup | Yes | All the experiments in this section are done with a batch size of 1, the block number per row of our method is 32, and the block size of block sparsity is 8x8, unless explicitly stated.
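Building on the pruning sketch above, here is how the stated default of 32 blocks per row would be applied; the 256x1024 weight matrix and the 0.6 sparsity level are assumed example values, not settings reported in the paper.

    import numpy as np

    # Hypothetical 256x1024 weight matrix; block_num=32 follows the default
    # stated in the experiment setup, while sparsity=0.6 is an assumed value.
    W = np.random.randn(256, 1024).astype(np.float32)
    W_pruned = balance_aware_prune(W, block_num=32, sparsity=0.6)

    # Every 32-element block in every row now keeps the same number of zeros.
    zeros_per_block = (W_pruned.reshape(256, 32, 32) == 0).sum(axis=-1)
    print(zeros_per_block.min(), zeros_per_block.max())  # identical counts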