BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization
Authors: Huanrui Yang, Lin Duan, Yiran Chen, Hai Li
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | BSQ achieves both higher accuracy and higher bit reduction on various model architectures on the CIFAR-10 and ImageNet datasets compared to previous methods. |
| Researcher Affiliation | Academia | Huanrui Yang, Lin Duan, Yiran Chen & Hai Li; Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA; {huanrui.yang, lin.duan, yiran.chen, hai.li}@duke.edu |
| Pseudocode | No | The paper describes algorithms and processes in text and uses figures to illustrate pipelines, but it does not include formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We use ResNet-20 models on the CIFAR-10 dataset (Krizhevsky & Hinton, 2009)... The CIFAR-10 dataset can be directly accessed through the dataset API provided in the torchvision python package. ...ResNet-50 and Inception-V3 models... are utilized for the experiments on the ImageNet dataset (Russakovsky et al., 2015). The ImageNet dataset... can be found at http://www.image-net.org/challenges/LSVRC/2012/nonpub-downloads. (A minimal torchvision loading sketch follows the table.) |
| Dataset Splits | Yes | We use all the data in the provided training set to train our model, and use the provided validation set to evaluate our model and report the testing accuracy. |
| Hardware Specification | Yes | All the training processes are done on a single TITAN XP GPU. ...Two TITAN RTX GPUs are used in parallel for the BSQ training and finetuning of both Res Net-50 and Inception-V3 models. |
| Software Dependencies | No | The paper mentions using the 'torchvision python package' and following the 'official PyTorch ImageNet example' but does not specify exact version numbers for these or other software dependencies. |
| Experiment Setup | Yes | The learning rate is set to 0.1 initially, and decayed by 0.1 at epochs 150, 250 and 325. ...The BSQ training is done for 350 epochs, with the first 250 epochs using learning rate 0.1 and the rest using learning rate 0.01. ...The finetuning is performed for 300 epochs with an initial learning rate of 0.01, decayed by 0.1 at epochs 150 and 250. ...All the training tasks are optimized with the SGD optimizer... with momentum 0.9 and weight decay 0.0001, and the batch size is set to 128. (A PyTorch sketch of this configuration follows the table.) |
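As a companion to the Open Datasets row, here is a minimal sketch of loading CIFAR-10 through the torchvision dataset API mentioned above. The augmentation choices (random crop with padding 4, horizontal flip) are common CIFAR-10 conventions assumed here, not settings stated in the paper; the batch size of 128 matches the quoted experiment setup.

```python
import torch
import torchvision
import torchvision.transforms as transforms

# Standard CIFAR-10 training augmentation (assumed, not specified in the paper).
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# Training split: the paper reports using all data in the provided training set.
train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=train_transform)
# Validation split: used to evaluate the model and report testing accuracy.
val_set = torchvision.datasets.CIFAR10(
    root="./data", train=False, download=True, transform=transforms.ToTensor())

# Batch size 128 matches the Experiment Setup row.
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=128, shuffle=False)
```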
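For the Experiment Setup row, the sketch below shows what the quoted SGD configuration and learning-rate schedule look like in PyTorch. The ResNet-18 stand-in and the empty epoch loop are placeholders only; BSQ's quantized layers and bit-level sparsity regularizer are not reproduced here.

```python
import torch
import torchvision

# Stand-in model: the paper uses ResNet-20 on CIFAR-10, which torchvision
# does not ship, so ResNet-18 is substituted purely for illustration.
model = torchvision.models.resnet18(num_classes=10)

# SGD with momentum 0.9 and weight decay 0.0001, as quoted above.
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)

# Baseline schedule: decay the learning rate by 0.1 at epochs 150, 250 and 325.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[150, 250, 325], gamma=0.1)

for epoch in range(350):
    # ... one training epoch over the CIFAR-10 loader would run here ...
    scheduler.step()
```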