Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning Low-precision Neural Networks without Straight-Through Estimator (STE)
Authors: Zhi-Gang Liu, Matthew Mattina
IJCAI 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the (AB) method, a 1-bit BinaryNet [Hubara et al., 2016a] on the CIFAR-10 dataset and 8-bit and 4-bit MobileNet v1 and ResNet 50 v1/2 on ImageNet are trained using the alpha-blending approach, and the evaluation indicates that AB improves top-1 accuracy by 0.9%, 0.82% and 2.93% respectively compared to the results of STE-based quantization [Hubara et al., 2016a] [Krishnamoorthi, 2018]. |
| Researcher Affiliation | Industry | Zhi-Gang Liu, Matthew Mattina, Arm Machine Learning Research Lab |
| Pseudocode | Yes | Algorithm 1 Alpha-blending optimization (ABO); Algorithm 2 Progressive Project Quantization (PPQ) |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. Footnotes link to third-party models, frameworks, or datasets, not the authors' implementation of Alpha-Blending. |
| Open Datasets | Yes | To evaluate the (AB) method, a 1-bit BinaryNet [Hubara et al., 2016a] on the CIFAR-10 dataset and 8-bit and 4-bit MobileNet v1 and ResNet 50 v1/2 on ImageNet are trained using the alpha-blending approach |
| Dataset Splits | No | Figure 5 shows validation Loss and accuracy curves, indicating a validation set was used, but the paper does not specify the dataset split percentages or sample counts for training, validation, or testing, nor does it reference predefined splits with citations for these specific proportions. |
| Hardware Specification | Yes | All evaluations were performed on an x86_64 Ubuntu Linux based Xeon server, Lenovo P710, with a Titan V GPU. |
| Software Dependencies | No | The paper mentions that TensorFlow was used for training but does not provide version numbers for TensorFlow or any other software libraries. |
| Experiment Setup | No | The paper describes the general training process for alpha-blending, including how alpha is gradually increased and that the optimization window [T0, T1] is a user-defined hyperparameter, and the learning rate is scaled by (1-alpha). However, it does not provide specific numerical values for these hyperparameters (e.g., T0, T1, initial learning rate, batch size, number of epochs, or specific optimizer configurations). |
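The Experiment Setup row above describes the shape of the alpha-blending schedule without concrete values: alpha is gradually increased over a user-defined window [T0, T1], the blended weight mixes full-precision and quantized weights, and the learning rate is scaled by (1 - alpha). A minimal sketch of that schedule follows, assuming a linear ramp and a uniform symmetric quantizer; the step bounds `t0`, `t1`, the bit width, and the ramp shape are illustrative assumptions, since the paper leaves these hyperparameters unspecified.

```python
def alpha_schedule(step, t0, t1):
    """Ramp alpha linearly from 0 (before t0) to 1 (after t1).

    The linear shape is an assumption; the paper only states that
    alpha is gradually increased over a user-defined window [t0, t1].
    """
    if step <= t0:
        return 0.0
    if step >= t1:
        return 1.0
    return (step - t0) / (t1 - t0)


def quantize(w, bits=8):
    """Uniform symmetric quantization of a weight list (illustrative)."""
    scale = max(abs(min(w)), abs(max(w))) or 1.0
    levels = 2 ** (bits - 1) - 1
    return [round(x / scale * levels) / levels * scale for x in w]


def blend(w, alpha, bits=8):
    """Alpha-blended weights: (1 - alpha) * w + alpha * quantize(w)."""
    q = quantize(w, bits)
    return [(1 - alpha) * wi + alpha * qi for wi, qi in zip(w, q)]


def scaled_lr(base_lr, alpha):
    """Learning rate scaled by (1 - alpha), as stated in the paper."""
    return (1 - alpha) * base_lr
```

At alpha = 0 training uses the full-precision weights; at alpha = 1 the forward pass is fully quantized and the scaled learning rate reaches zero, which is consistent with the description of the blending window as the transition from full-precision to quantized training.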