How Do Adam and Training Strategies Help BNNs Optimization?
Authors: Zechun Liu, Zhiqiang Shen, Shichao Li, Koen Helwegen, Dong Huang, Kwang-Ting Cheng
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments and analysis, we derive a simple training scheme, building on existing Adam-based optimization, which achieves 70.5% top-1 accuracy on the ImageNet dataset using the same architecture as the state-of-the-art ReActNet (Liu et al., 2020) while achieving 1.1% higher accuracy. |
| Researcher Affiliation | Collaboration | ¹Hong Kong University of Science and Technology, ²Carnegie Mellon University, ³Plumerai. |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code and models are available at https://github.com/liuzechun/AdamBNN. |
| Open Datasets | Yes | All the analytical experiments are conducted on the ImageNet 2012 classification dataset (Russakovsky et al., 2015). |
| Dataset Splits | No | The paper discusses 'validation accuracy' and 'training accuracy' but does not explicitly provide the specific percentages or sample counts for its training, validation, or test dataset splits needed for reproduction. While ImageNet has standard splits, the paper does not state that these were specifically used or detail its own split methodology. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. It only mentions that PyTorch default values were used for the initial learning rates. |
| Software Dependencies | No | The paper mentions using PyTorch for setting initial learning rates: 'In this experiment, the initial learning rates for different optimizers are set to the PyTorch (Paszke et al., 2019) default values'. However, it does not provide a specific version number for PyTorch or any other software dependency. |
| Experiment Setup | Yes | We train the network for 600K iterations with batch size set to 512. The initial learning rate is set to 0.1 for SGD and 0.0025 for Adam, with linear learning rate decay. |
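The Experiment Setup row above quotes the only optimization details the paper reports: 600K iterations, batch size 512, initial learning rate 0.1 for SGD or 0.0025 for Adam, and linear learning-rate decay. Below is a minimal PyTorch sketch of that schedule, assuming a standard per-iteration training loop; the toy model, synthetic batches, and loss are stand-ins (the actual experiments use a ReActNet-style binary network on ImageNet), and only the quoted hyperparameters come from the paper.

```python
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import LambdaLR

# Hyperparameters quoted from the paper's experiment setup.
TOTAL_ITERS = 600_000          # training iterations
BATCH_SIZE = 512               # batch size
INIT_LR_ADAM = 0.0025          # initial learning rate for Adam (0.1 for SGD)

# Stand-in model and loss; the paper's binary network is not reproduced here.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 1000))
criterion = nn.CrossEntropyLoss()

optimizer = torch.optim.Adam(model.parameters(), lr=INIT_LR_ADAM)
# To compare against SGD as in the paper:
# optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Linear learning-rate decay from the initial value down to zero over the run.
scheduler = LambdaLR(optimizer, lr_lambda=lambda it: 1.0 - it / TOTAL_ITERS)

for it in range(TOTAL_ITERS):
    images = torch.randn(BATCH_SIZE, 3, 32, 32)    # synthetic batch as a placeholder
    labels = torch.randint(0, 1000, (BATCH_SIZE,))
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    scheduler.step()                               # decay once per iteration
```

Note that the scheduler is stepped per iteration rather than per epoch, which is the natural reading of "600K iterations with linear learning rate decay"; the paper itself does not spell out the decay granularity.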