Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Not All Bits have Equal Value: Heterogeneous Precisions via Trainable Noise
Authors: Pedro Savarese, Xin Yuan, Yanjing Li, Michael Maire
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that it finds highly heterogeneous precision assignments for CNNs trained on CIFAR and ImageNet, improving upon previous state-of-the-art quantization methods. Our improvements extend to the challenging scenario of learning reduced-precision GANs. |
| Researcher Affiliation | Academia | Pedro Savarese TTI-Chicago EMAIL Xin Yuan University of Chicago EMAIL Yanjing Li University of Chicago EMAIL Michael Maire University of Chicago EMAIL |
| Pseudocode | Yes | Algorithm 1 SMOL |
| Open Source Code | No | [No] We will release full source code upon paper acceptance. |
| Open Datasets | Yes | We first compare SMOL against different quantization methods on the small-scale CIFAR-10 dataset |
| Dataset Splits | Yes | We first compare SMOL against different quantization methods on the small-scale CIFAR-10 dataset |
| Hardware Specification | No | Our experiments involve training standard deep neural network models on modern GPUs; we include details on training epochs used in all experiments. ... a batch size of 256 which is distributed across 4 GPUs. The paper does not name specific GPU models or types. |
| Software Dependencies | No | For all experiments we train the auxiliary parameters s with Adam [22], using the default learning rate of $10^{-3}$ and no weight decay; all its other hyperparameters are set to their default values. The paper gives no specific version numbers for software or libraries. |
| Experiment Setup | Yes | We adopt the standard data augmentation procedure of applying random translations and horizontal flips to training images, and train each network for a total of 650 epochs: the precisions are trained with SMOL for the first 350 while the remaining 300 are used to fine-tune the weights while the precisions remain fixed. ... To train the weights we use SGD with a momentum of 0.9 and an initial learning rate of 0.1, which is decayed at epochs 250, 500, and 600. We use a batch size of 128 and a weight decay of $10^{-4}$ for ResNet-20, $4 \times 10^{-5}$ for MobileNetV2, and $5 \times 10^{-4}$ for ShuffleNet. |
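
The schedule quoted in the Experiment Setup row can be written down as a small configuration sketch, which may help when checking a reimplementation against the reported numbers. This is not the authors' code. The epoch counts, milestones, momentum, batch size, and weight decays are taken from the quoted text; the decay factor of 0.1, the zero-based epoch indexing, and the names `phase` and `learning_rate` are assumptions made for illustration.

```python
# Values quoted in the Experiment Setup row.
TOTAL_EPOCHS = 650
PRECISION_EPOCHS = 350            # precisions are trained with SMOL
LR_MILESTONES = (250, 500, 600)   # epochs at which the learning rate is decayed
INITIAL_LR = 0.1
MOMENTUM = 0.9
BATCH_SIZE = 128
WEIGHT_DECAY = {
    "ResNet-20": 1e-4,
    "MobileNetV2": 4e-5,
    "ShuffleNet": 5e-4,
}

# Assumption: the quoted text does not state the decay factor.
ASSUMED_DECAY_FACTOR = 0.1


def phase(epoch: int) -> str:
    """Return the training phase for a zero-based epoch index (assumed indexing)."""
    if not 0 <= epoch < TOTAL_EPOCHS:
        raise ValueError(f"epoch must be in [0, {TOTAL_EPOCHS}), got {epoch}")
    if epoch < PRECISION_EPOCHS:
        return "train_precisions"
    return "finetune_weights"


def learning_rate(epoch: int, decay_factor: float = ASSUMED_DECAY_FACTOR) -> float:
    """Step schedule: multiply by decay_factor once per milestone already reached."""
    if not 0 <= epoch < TOTAL_EPOCHS:
        raise ValueError(f"epoch must be in [0, {TOTAL_EPOCHS}), got {epoch}")
    milestones_passed = sum(epoch >= m for m in LR_MILESTONES)
    return INITIAL_LR * decay_factor ** milestones_passed
```

Under these assumptions the first learning-rate decay falls inside the precision-training phase, while the last two fall inside the fine-tuning phase. Whether the paper applies one continuous schedule across both phases is not stated in the quoted text.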