Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
How Do Adam and Training Strategies Help BNNs Optimization?
Authors: Zechun Liu, Zhiqiang Shen, Shichao Li, Koen Helwegen, Dong Huang, Kwang-Ting Cheng
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments and analysis, we derive a simple training scheme, building on existing Adam-based optimization, which achieves 70.5% top-1 accuracy on the ImageNet dataset using the same architecture as the state-of-the-art ReActNet (Liu et al., 2020) while achieving 1.1% higher accuracy. |
| Researcher Affiliation | Collaboration | 1Hong Kong University of Science and Technology 2Carnegie Mellon University 3Plumerai. |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code and models are available at https://github.com/liuzechun/AdamBNN. |
| Open Datasets | Yes | All the analytical experiments are conducted on the ImageNet 2012 classification dataset (Russakovsky et al., 2015). |
| Dataset Splits | No | The paper discusses 'validation accuracy' and 'training accuracy' but does not explicitly provide the specific percentages or sample counts for its training, validation, or test dataset splits needed for reproduction. While ImageNet has standard splits, the paper does not state that these were specifically used or detail its own split methodology. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. It mentions PyTorch only in the context of default learning rates, not hardware. |
| Software Dependencies | No | The paper mentions using PyTorch for setting initial learning rates: 'In this experiment, the initial learning rates for different optimizers are set to the PyTorch (Paszke et al., 2019) default values'. However, it does not provide a specific version number for PyTorch or any other software dependency. |
| Experiment Setup | Yes | We train the network for 600K iterations with batch size set to 512. The initial learning rate is set to 0.1 for SGD and 0.0025 for Adam, with linear learning rate decay. |
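The reported setup (600K iterations, batch size 512, initial learning rate 0.1 for SGD and 0.0025 for Adam, linear decay) can be expressed as a short schedule function. The sketch below is not the authors' code; the function name and the decay-to-zero endpoint are assumptions, since the paper states only that the decay is linear.

```python
# Hedged sketch of the paper's reported learning-rate schedule.
# Assumption: "linear learning rate decay" means decaying linearly
# from the initial value to zero over the full 600K iterations.

TOTAL_ITERS = 600_000   # training length from the paper
ADAM_INIT_LR = 0.0025   # initial learning rate for Adam
SGD_INIT_LR = 0.1       # initial learning rate for SGD

def linear_decay_lr(init_lr: float, step: int, total: int = TOTAL_ITERS) -> float:
    """Learning rate after `step` iterations under linear decay to zero."""
    return init_lr * (1.0 - step / total)

# Adam's learning rate at the start, midpoint, and end of training.
print(linear_decay_lr(ADAM_INIT_LR, 0))        # 0.0025
print(linear_decay_lr(ADAM_INIT_LR, 300_000))  # 0.00125
print(linear_decay_lr(ADAM_INIT_LR, 600_000))  # 0.0
```

In a training loop, this value would be assigned to the optimizer's learning rate each iteration (e.g. via `torch.optim.lr_scheduler.LambdaLR` in PyTorch, which the paper names as its framework).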