Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Training Quantized Nets: A Deeper Understanding
Authors: Hao Li, Soham De, Zheng Xu, Christoph Studer, Hanan Samet, Tom Goldstein
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We investigate training methods for quantized neural networks from a theoretical viewpoint. We first explore accuracy guarantees for training methods under convexity assumptions. We then look at the behavior of these algorithms for non-convex problems, and show that training algorithms that exploit high-precision representations have an important greedy search phase that purely quantized training methods lack, which explains the difficulty of training using low-precision arithmetic. [...] 6 Experiments: To explore the implications of the theory above, we train both VGG-like networks [24] and Residual networks [25] with binarized weights on image classification problems. On CIFAR-10, we train ResNet-56, wide ResNet-56 (WRN-56-2, with 2X more filters than ResNet-56), VGG-9, and the high capacity VGG-BC network used for the original BC model [5]. We also train ResNet-56 on CIFAR-100, and ResNet-18 on ImageNet [26]. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, University of Maryland, College Park; ²School of Electrical and Computer Engineering, Cornell University |
| Pseudocode | No | The paper describes algorithms using mathematical equations (e.g., Eq 2, 4, 5, 6, 7) but does not provide structured pseudocode blocks or a section explicitly labeled 'Algorithm' or 'Pseudocode'. |
| Open Source Code | No | The paper does not provide any concrete access information for source code, such as a repository link or an explicit statement of code release. |
| Open Datasets | Yes | On CIFAR-10, we train ResNet-56, wide ResNet-56 (WRN-56-2, with 2X more filters than ResNet-56), VGG-9, and the high capacity VGG-BC network used for the original BC model [5]. We also train ResNet-56 on CIFAR-100, and ResNet-18 on ImageNet [26]. |
| Dataset Splits | Yes | The image pre-processing and data augmentation procedures are the same as [25]. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for the experiments are mentioned in the paper. |
| Software Dependencies | No | We use Adam [27] as our baseline optimizer... No version numbers for Adam or any other software dependencies are provided. |
| Experiment Setup | Yes | We set the initial learning rate to 0.01 and decrease the learning rate by a factor of 10 at epochs 82 and 122 for CIFAR-10 and CIFAR-100 [25]. For ImageNet experiments, we train the model for 90 epochs and decrease the learning rate at epochs 30 and 60. [...] To verify this, we tried different batch sizes for SR including 128, 256, 512 and 1024, and found that the larger the batch size, the better the performance of SR. |
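
The step schedule quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `step_lr`, the 0-indexed epoch convention, and the choice to apply each drop from the listed epoch onward are assumptions. The quoted text also does not state the initial rate or decay factor for ImageNet, so reusing 0.01 and a factor of 10 there is an assumption as well.

```python
def step_lr(epoch, base_lr=0.01, milestones=(82, 122), factor=10.0):
    """Learning rate for a given epoch under a step decay schedule.

    Assumption: epochs are 0-indexed and each drop applies from the
    milestone epoch onward; the quoted setup specifies neither.
    """
    # Count how many milestones have been reached by this epoch.
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr / (factor ** drops)


# CIFAR-10 / CIFAR-100: 10x drops at epochs 82 and 122 (from the quoted setup).
CIFAR_MILESTONES = (82, 122)
# ImageNet: 90 epochs with drops at 30 and 60; the 10x factor is an assumption.
IMAGENET_MILESTONES = (30, 60)

if __name__ == "__main__":
    # Print the rate just before and at each CIFAR milestone.
    for epoch in (0, 81, 82, 121, 122):
        print(epoch, step_lr(epoch, milestones=CIFAR_MILESTONES))
```

Calling `step_lr(epoch, milestones=IMAGENET_MILESTONES)` for epochs 0 through 89 gives the corresponding ImageNet schedule under the same assumptions.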