Learning Low-precision Neural Networks without Straight-Through Estimator (STE)

Authors: Zhi-Gang Liu, Matthew Mattina

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the alpha-blending (AB) method, a 1-bit BinaryNet [Hubara et al., 2016a] is trained on CIFAR-10, and 8-bit and 4-bit MobileNet v1 and ResNet-50 v1/v2 are trained on ImageNet using the alpha-blending approach; the evaluation indicates that AB improves top-1 accuracy by 0.9%, 0.82%, and 2.93%, respectively, compared to the results of STE-based quantization [Hubara et al., 2016a; Krishnamoorthi, 2018].
Researcher Affiliation | Industry | Zhi-Gang Liu, Matthew Mattina, Arm Machine Learning Research Lab, {zhi-gang.liu, matthew.mattina}@arm.com
Pseudocode | Yes | Algorithm 1: Alpha-blending optimization (ABO); Algorithm 2: Progressive Project Quantization (PPQ)
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. Footnotes link to third-party models, frameworks, or datasets, not the authors' implementation of Alpha-Blending.
Open Datasets | Yes | To evaluate the AB method, a 1-bit BinaryNet [Hubara et al., 2016a] is trained on the CIFAR-10 dataset, and 8-bit and 4-bit MobileNet v1 and ResNet-50 v1/v2 are trained on ImageNet using the alpha-blending approach.
Dataset Splits | No | Figure 5 shows validation loss and accuracy curves, indicating that a validation set was used, but the paper does not specify the dataset split percentages or sample counts for training, validation, or testing, nor does it cite predefined splits for these proportions.
Hardware Specification | Yes | All evaluations were performed on an x86_64 Ubuntu Linux Xeon server (Lenovo P710) with a Titan V GPU.
Software Dependencies | No | The paper mentions that TensorFlow was used for training but does not provide specific version numbers for TensorFlow or other software libraries.
Experiment Setup | No | The paper describes the general alpha-blending training process: alpha is gradually increased over a user-defined optimization window [T0, T1], and the learning rate is scaled by (1 - alpha). However, it does not provide specific numerical values for these hyperparameters (e.g., T0, T1, initial learning rate, batch size, number of epochs, or optimizer configuration).
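
For context on the method assessed above, the mechanics summarized in the Research Type and Experiment Setup rows can be sketched in a few lines. The snippet below is a minimal, framework-agnostic illustration, not the authors' implementation: the symmetric uniform quantizer, the linear ramp for alpha, and the names `t0`, `t1`, and `base_lr` are placeholder assumptions standing in for the unspecified hyperparameters noted in the Experiment Setup row.

```python
# Minimal sketch of alpha-blending (AB) quantization-aware training.
# Assumptions (not from the paper): symmetric uniform quantizer, linear
# alpha ramp, and placeholder values for t0, t1, and base_lr.
import numpy as np

def quantize_uniform(w, num_bits=4):
    """Symmetric uniform quantizer standing in for q(w)."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax + 1e-12
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

def alpha_schedule(step, t0, t1):
    """Ramp alpha from 0 to 1 over the user-defined window [t0, t1]
    (linear ramp assumed; the paper's exact schedule may differ)."""
    if step <= t0:
        return 0.0
    if step >= t1:
        return 1.0
    return (step - t0) / (t1 - t0)

def ab_forward_weight(w, alpha, num_bits=4):
    """Blended weight used in the forward pass:
        w_ab = (1 - alpha) * w + alpha * q(w).
    Gradients reach w through the continuous (1 - alpha) * w term,
    so no straight-through estimator is required."""
    return (1.0 - alpha) * w + alpha * quantize_uniform(w, num_bits)

def ab_learning_rate(base_lr, alpha):
    """Per the report, the learning rate is scaled by (1 - alpha)."""
    return base_lr * (1.0 - alpha)

# Example usage with made-up numbers:
w = np.random.randn(8) * 0.1
for step in (0, 500, 1000):
    alpha = alpha_schedule(step, t0=0, t1=1000)
    w_ab = ab_forward_weight(w, alpha)
    lr = ab_learning_rate(base_lr=0.01, alpha=alpha)
```

At alpha = 1 the blended weight collapses to the quantized weight, which is why the choice of window [T0, T1] and the (1 - alpha) learning-rate scaling, left unspecified in the paper, matter for reproducing the reported accuracy.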