Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Not All Bits have Equal Value: Heterogeneous Precisions via Trainable Noise

Authors: Pedro Savarese, Xin Yuan, Yanjing Li, Michael Maire

NeurIPS 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments show that it finds highly heterogeneous precision assignments for CNNs trained on CIFAR and ImageNet, improving upon previous state-of-the-art quantization methods. Our improvements extend to the challenging scenario of learning reduced-precision GANs."
Researcher Affiliation | Academia | Pedro Savarese (TTI-Chicago), Xin Yuan (University of Chicago), Yanjing Li (University of Chicago), Michael Maire (University of Chicago)
Pseudocode | Yes | "Algorithm 1 SMOL"
Open Source Code | No | "[No] We will release full source code upon paper acceptance."
Open Datasets | Yes | "We first compare SMOL against different quantization methods on the small-scale CIFAR-10 dataset"
Dataset Splits | Yes | "We first compare SMOL against different quantization methods on the small-scale CIFAR-10 dataset"
Hardware Specification | No | "Our experiments involve training standard deep neural network models on modern GPUs; we include details on training epochs used in all experiments." ... "a batch size of 256 which is distributed across 4 GPUs." The paper does not specify the models or types of GPUs used.
Software Dependencies | No | "For all experiments we train the auxiliary parameters s with Adam [22], using the default learning rate of 10^-3 and no weight decay; all its other hyperparameters are set to their default values." No specific version numbers are given for software or libraries.
Experiment Setup | Yes | "We adopt the standard data augmentation procedure of applying random translations and horizontal flips to training images, and train each network for a total of 650 epochs: the precisions are trained with SMOL for the first 350 while the remaining 300 are used to fine-tune the weights while the precisions remain fixed. ... To train the weights we use SGD with a momentum of 0.9 and an initial learning rate of 0.1, which is decayed at epochs 250, 500, and 600. We use a batch size of 128 and a weight decay of 10^-4 for ResNet-20, 4×10^-5 for MobileNetV2, and 5×10^-4 for ShuffleNet."
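The Experiment Setup excerpt above can be sketched as a small schedule helper. This is a minimal illustration, not the authors' code: the decay factor at each milestone is not stated in the excerpt, so a conventional factor of 0.1 is assumed, and the function names (`weight_lr`, `training_phase`) are hypothetical.

```python
# Sketch of the training schedule described in the Experiment Setup row.
# Assumption: the excerpt does not give the decay factor, so gamma=0.1
# is used here purely for illustration.

# Per-model weight decay values quoted from the excerpt.
WEIGHT_DECAY = {
    "ResNet-20": 1e-4,
    "MobileNetV2": 4e-5,
    "ShuffleNet": 5e-4,
}

def weight_lr(epoch, base_lr=0.1, milestones=(250, 500, 600), gamma=0.1):
    """Step-decay schedule for the weight optimizer (SGD, momentum 0.9):
    the learning rate is multiplied by `gamma` at each milestone epoch."""
    n_decays = sum(epoch >= m for m in milestones)
    return base_lr * gamma ** n_decays

def training_phase(epoch, precision_epochs=350, total_epochs=650):
    """650 total epochs: precisions are trained with SMOL for the first
    350, then weights are fine-tuned with precisions fixed for the
    remaining 300."""
    if not 0 <= epoch < total_epochs:
        raise ValueError("epoch outside the 650-epoch schedule")
    return "smol-precision" if epoch < precision_epochs else "weight-finetune"
```

Under the assumed gamma, `weight_lr(300)` returns the once-decayed rate (0.1 × 0.1 = 0.01), and `training_phase(350)` marks the switch to weight fine-tuning.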