Training Binary Neural Networks through Learning with Noisy Supervision
Authors: Kai Han, Yunhe Wang, Yixing Xu, Chunjing Xu, Enhua Wu, Chang Xu
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on benchmark datasets indicate that the proposed binarization technique attains consistent improvements over baselines. |
| Researcher Affiliation | Collaboration | 1State Key Lab of Computer Science, Institute of Software, CAS & University of Chinese Academy of Sciences; 2Noah's Ark Lab, Huawei Technologies; 3University of Macau; 4School of Computer Science, Faculty of Engineering, University of Sydney. |
| Pseudocode | Yes | Algorithm 1: Feed-Forward and Back-Propagation Process of Binary Neuron Mapping with Noisy Supervision (a hedged sketch of the generic feed-forward/back-propagation pattern follows the table). |
| Open Source Code | No | No explicit statement about releasing source code or a link to a code repository is found in the paper. |
| Open Datasets | Yes | CIFAR-10 dataset (Krizhevsky & Hinton, 2009) consists of 60,000 32×32 color images belonging to 10 categories, with 6,000 images per category. |
| Dataset Splits | Yes | For hyper-parameter tuning, 10,000 training images are randomly sampled for validation and the remaining images are used for training (a split sketch follows the table). |
| Hardware Specification | Yes | All the models are implemented using PyTorch (Paszke et al., 2019) and run on NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | All the models are implemented using PyTorch (Paszke et al., 2019). A specific version number for PyTorch or other libraries is not provided beyond the citation year. |
| Experiment Setup | Yes | For CIFAR-10, ResNet-20 is used as the baseline model. The binary baseline models are trained for 400 epochs with a batch size of 128 and an initial learning rate of 0.1. We use the SGD optimizer with a momentum of 0.9 and set the weight decay to 0. Our method is fine-tuned from the pretrained baseline for 120 epochs using the SGD optimizer. The learning rate starts from 0.01 and is decayed by a factor of 0.1 every 30 epochs (a training-schedule sketch follows the table). |
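
The Pseudocode row refers to Algorithm 1, the feed-forward and back-propagation process of the binary neuron mapping. The paper's noisy-supervision correction is not reproduced here; the sketch below only illustrates the generic pattern such an algorithm builds on, assuming a standard sign binarization with a straight-through estimator (STE) gradient, in PyTorch.

```python
# Minimal sketch: sign binarization with a straight-through estimator.
# This is an assumption-level illustration, NOT the paper's Algorithm 1.
import torch


class BinaryActivation(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # Feed-forward: binarize real-valued activations with the sign function.
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Back-propagation: STE passes the gradient through, clipped to |x| <= 1.
        return grad_output * (x.abs() <= 1).float()


x = torch.randn(4, requires_grad=True)
y = BinaryActivation.apply(x)
y.sum().backward()
print(y, x.grad)
```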
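The Dataset Splits row quotes a 10,000-image validation hold-out from the 50,000 CIFAR-10 training images. A minimal sketch of such a split is shown below; the data root path, transform, and random seed are assumptions, not taken from the paper.

```python
# Sketch of holding out 10,000 CIFAR-10 training images for validation.
import torch
import torchvision
from torchvision import transforms

full_train = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True,   # root path is an assumption
    transform=transforms.ToTensor())
train_set, val_set = torch.utils.data.random_split(
    full_train, [40000, 10000],
    generator=torch.Generator().manual_seed(0))  # seed chosen for illustration
```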
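The Experiment Setup row lists the fine-tuning hyper-parameters (120 epochs, SGD with momentum 0.9, weight decay 0, learning rate 0.01 decayed by 0.1 every 30 epochs, batch size 128). The sketch below wires those numbers into a PyTorch training loop; the model and data are hypothetical placeholders standing in for the pretrained binary ResNet-20 and the CIFAR-10 loader.

```python
# Sketch of the quoted fine-tuning schedule with placeholder model and data.
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholders (hypothetical): stand-ins for binary ResNet-20 and CIFAR-10.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
train_loader = DataLoader(
    TensorDataset(torch.randn(512, 3, 32, 32), torch.randint(0, 10, (512,))),
    batch_size=128, shuffle=True)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=0)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(120):                    # fine-tune for 120 epochs
    for images, labels in train_loader:     # quoted batch size: 128
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()                        # lr *= 0.1 every 30 epochs
```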