TETRIS: TilE-matching the TRemendous Irregular Sparsity

Authors: Yu Ji, Ling Liang, Lei Deng, Youyang Zhang, Youhui Zhang, Yuan Xie

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test our method on three networks of different scales: LeNet on MNIST, VGG14 on CIFAR-10, and VGG16 on ImageNet. The first two models are trained from scratch to get the baseline accuracy, and the last model is obtained from torchvision [29]. For retraining the pruned VGG16, we use a learning rate of 0.001 and retrain for 20 epochs. In all of our tests, if the block size is larger than the channel dimension of the layer, we reduce the block size for that layer to ensure that each layer has at least two blocks. (A sketch of this block-size rule appears after the table.)
Researcher Affiliation | Academia | Yu Ji (1,2,3), Ling Liang (3), Lei Deng (3), Youyang Zhang (1), Youhui Zhang (1,2), Yuan Xie (3). 1: Department of Computer Science and Technology, Tsinghua University; 2: Beijing Innovation Center for Future Chip; 3: Department of Electrical and Computer Engineering, University of California, Santa Barbara. Emails: {jiy15,zhang-yy15}@mails.tsinghua.edu.cn, zyh02@tsinghua.edu.cn, {lingliang,leideng,yuanxie}@ece.ucsb.edu
Pseudocode | Yes | Algorithm 1: Reordering algorithm. (An illustrative reordering sketch appears after the table.)
Open Source Code | No | The paper mentions using a third-party open-source library (blocksparse) but does not provide access to the authors' own source code for the described methodology.
Open Datasets | Yes | We test our method on three networks of different scales: LeNet on MNIST, VGG14 on CIFAR-10, and VGG16 on ImageNet. The first two models are trained from scratch to get the baseline accuracy, and the last model is obtained from torchvision [29].
Dataset Splits | No | The paper does not explicitly provide training, validation, and test split percentages or sample counts for the datasets used.
Hardware Specification | Yes | The blocksparse library [25], an open-source GPU kernel for block sparsity, is used for the evaluation on a Titan V GPU. (A usage sketch for blocksparse appears after the table.)
Software Dependencies | No | The paper mentions implementing the method in PyTorch and using cuBLAS as the backend, but it does not specify version numbers for these software components.
Experiment Setup | Yes | For retraining the pruned VGG16, we use a learning rate of 0.001 and retrain for 20 epochs. (A retraining sketch appears after the table.)
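
The block-size rule quoted in the Research Type row can be made concrete. Below is a minimal sketch of one plausible reading of that rule; the helper name is ours, and halving the channel dimension (which yields exactly two blocks) is an assumption, since the paper only requires "at least two blocks" per layer.

```python
# Hypothetical helper illustrating the paper's per-layer block-size rule:
# if the requested block size exceeds the layer's channel dimension,
# shrink it so the layer still contains at least two blocks.

def effective_block_size(requested: int, channel_dim: int) -> int:
    """Clamp the block size so a layer holds at least two blocks."""
    if requested > channel_dim:
        # Assumption: fall back to half the channel dimension, giving
        # exactly two blocks along that dimension.
        return max(1, channel_dim // 2)
    return requested

assert effective_block_size(64, 512) == 64  # unchanged: many blocks fit
assert effective_block_size(64, 32) == 16   # clamped: two blocks of 16
```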
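
The Pseudocode row refers to the paper's Algorithm 1, a reordering algorithm that permutes weight dimensions so that important weights cluster into dense blocks. The sketch below is not a reproduction of Algorithm 1, whose details are not quoted here; it is a generic greedy row-reordering heuristic, with all names and the 50% block-sparsity criterion being our assumptions, meant only to illustrate the kind of permutation search such an algorithm performs.

```python
import numpy as np

def preserved_mass(w: np.ndarray, bs: int) -> float:
    """Weight magnitude kept if the top half of bs-by-bs blocks survive pruning."""
    norms = sorted(
        (np.abs(w[i:i + bs, j:j + bs]).sum()
         for i in range(0, w.shape[0], bs)
         for j in range(0, w.shape[1], bs)),
        reverse=True,
    )
    return sum(norms[: len(norms) // 2])

def greedy_row_reorder(w: np.ndarray, bs: int, passes: int = 2) -> np.ndarray:
    """Greedily swap rows whenever a swap increases the preserved weight mass."""
    perm = np.arange(w.shape[0])
    best = preserved_mass(w[perm], bs)
    for _ in range(passes):
        for i in range(len(perm)):
            for j in range(i + 1, len(perm)):
                perm[[i, j]] = perm[[j, i]]      # trial swap
                score = preserved_mass(w[perm], bs)
                if score > best:
                    best = score                 # keep the swap
                else:
                    perm[[i, j]] = perm[[j, i]]  # undo
    return perm

w = np.random.randn(64, 64)
perm = greedy_row_reorder(w, bs=8)
print(preserved_mass(w, 8), preserved_mass(w[perm], 8))
```

The same search can be run over columns (input channels); the paper's actual algorithm is more structured than this brute-force sketch.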
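
For the Hardware Specification row: the blocksparse library cited by the paper is OpenAI's TensorFlow-based GPU kernel package. The snippet below follows the usage pattern from that project's README as we recall it (TensorFlow 1.x API, random block layout); exact signatures should be verified against the library itself.

```python
import numpy as np
import tensorflow as tf
from blocksparse.matmul import BlocksparseMatMul

hidden_size, block_size, minibatch_size = 4096, 32, 64

# Random block-level sparsity layout: 1 = dense block present, 0 = pruned.
layout = np.random.randint(2, size=(hidden_size // block_size,
                                    hidden_size // block_size))

bsmm = BlocksparseMatMul(layout, block_size=block_size)

x = tf.placeholder(tf.float32, shape=[None, hidden_size])
w = tf.get_variable("w", bsmm.w_shape, dtype=tf.float32)  # block-sparse weights
y = bsmm(x, w)  # block-sparse matmul executed by the custom GPU kernel

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out = sess.run(y, feed_dict={x: np.ones((minibatch_size, hidden_size),
                                            dtype=np.float32)})
    print(out.shape)
```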
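
Finally, the Experiment Setup row pins down only two hyperparameters: learning rate 0.001 and 20 retraining epochs for the pruned VGG16. A hedged PyTorch sketch of that recipe follows; the optimizer choice (SGD with momentum), the batch size, and the dummy data standing in for the ImageNet loader are all our assumptions.

```python
import torch
import torch.nn as nn
import torchvision
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"

# VGG16 obtained from torchvision, as stated in the paper.
model = torchvision.models.vgg16(pretrained=True).to(device)

# Stand-in data so the sketch runs; a real run would use an ImageNet loader.
dummy = TensorDataset(torch.randn(8, 3, 224, 224),
                      torch.randint(0, 1000, (8,)))
train_loader = DataLoader(dummy, batch_size=4)

criterion = nn.CrossEntropyLoss()
# Assumption: plain SGD; the paper specifies only the learning rate.
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

for epoch in range(20):  # "retrain 20 epochs"
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

In an actual reproduction, the block-pruning masks would be re-applied after each optimizer step so that pruned blocks stay at zero.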