Not All Bits have Equal Value: Heterogeneous Precisions via Trainable Noise
Authors: Pedro Savarese, Xin Yuan, Yanjing Li, Michael Maire
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that it finds highly heterogeneous precision assignments for CNNs trained on CIFAR and ImageNet, improving upon previous state-of-the-art quantization methods. Our improvements extend to the challenging scenario of learning reduced-precision GANs. |
| Researcher Affiliation | Academia | Pedro Savarese (TTI-Chicago) savarese@ttic.edu; Xin Yuan (University of Chicago) yuanx@uchicago.edu; Yanjing Li (University of Chicago) yanjingl@uchicago.edu; Michael Maire (University of Chicago) mmaire@uchicago.edu |
| Pseudocode | Yes | Algorithm 1 SMOL |
| Open Source Code | No | [No] We will release full source code upon paper acceptance. |
| Open Datasets | Yes | We first compare SMOL against different quantization methods on the small-scale CIFAR-10 dataset |
| Dataset Splits | Yes | We first compare SMOL against different quantization methods on the small-scale CIFAR-10 dataset |
| Hardware Specification | No | Our experiments involve training standard deep neural network models on modern GPUs; we include details on training epochs used in all experiments. ... a batch size of 256 which is distributed across 4 GPUs. The paper does not specify the GPU models or types used. |
| Software Dependencies | No | For all experiments we train the auxiliary parameters s with Adam [22], using the default learning rate of 10^-3 and no weight decay; all its other hyperparameters are set to their default values. No specific version numbers for software or libraries are provided. |
| Experiment Setup | Yes | We adopt the standard data augmentation procedure of applying random translations and horizontal flips to training images, and train each network for a total of 650 epochs: the precisions are trained with SMOL for the first 350 while the remaining 300 are used to fine-tune the weights while the precisions remain fixed. ... To train the weights we use SGD with a momentum of 0.9 and an initial learning rate of 0.1, which is decayed at epochs 250, 500, and 600. We use a batch size of 128 and a weight decay of 10^-4 for ResNet-20, 4×10^-5 for MobileNetV2, and 5×10^-4 for ShuffleNet. (A hedged code sketch of this schedule follows the table.) |
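
The experiment-setup row quotes enough hyperparameters to outline the paper's two-optimizer training loop. The sketch below is a minimal PyTorch reconstruction under assumptions: `model`, `aux_params` (the precision parameters s), `loader`, and `loss_fn` are placeholders, since the SMOL source code is not released, and the learning-rate decay factor of 0.1 is an assumption (the quoted text gives only the decay epochs).

```python
# Hedged sketch of the CIFAR-10 training schedule quoted above; not the authors' code.
import torch
from torch import optim

def build_optimizers(model, aux_params, weight_decay=1e-4):
    # Weights: SGD, momentum 0.9, initial lr 0.1, decayed at epochs 250/500/600.
    # weight_decay is 1e-4 for ResNet-20, 4e-5 for MobileNetV2, 5e-4 for ShuffleNet.
    w_opt = optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                      weight_decay=weight_decay)
    # Decay factor 0.1 is assumed; the quote only lists the milestone epochs.
    w_sched = optim.lr_scheduler.MultiStepLR(w_opt, milestones=[250, 500, 600], gamma=0.1)
    # Auxiliary precision parameters s: Adam with default lr 1e-3, no weight decay.
    s_opt = optim.Adam(aux_params, lr=1e-3, weight_decay=0.0)
    return w_opt, w_sched, s_opt

def train(model, aux_params, loader, loss_fn, device="cuda"):
    w_opt, w_sched, s_opt = build_optimizers(model, aux_params)
    for epoch in range(650):
        # Precisions are trained for the first 350 epochs, then frozen while
        # the weights are fine-tuned for the remaining 300.
        train_precisions = epoch < 350
        for x, y in loader:  # batch size 128, random translations + horizontal flips
            x, y = x.to(device), y.to(device)
            loss = loss_fn(model(x), y)
            w_opt.zero_grad()
            if train_precisions:
                s_opt.zero_grad()
            loss.backward()
            w_opt.step()
            if train_precisions:
                s_opt.step()
        w_sched.step()
```

The two optimizers mirror the split described in the quotes: SGD for the network weights and Adam for the auxiliary precision parameters; after epoch 350 only the weight optimizer steps, which keeps the learned precisions fixed during fine-tuning.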