Training binary neural networks with real-to-binary convolutions
Authors: Brais Martinez, Jing Yang, Adrian Bulat, Georgios Tzimiropoulos
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present two main sets of experiments. We used ImageNet (Russakovsky et al., 2015) as a benchmark to compare our method against other state-of-the-art approaches in Sec. 5.1. ImageNet is the most widely used dataset to report results on binary networks and, at the same time, allows us to show for the first time that binary networks can perform competitively on a large-scale dataset. We further used CIFAR-100 (Krizhevsky & Hinton, 2009) to conduct ablation studies (Sec. 5.2). |
| Researcher Affiliation | Collaboration | Brais Martinez1, Jing Yang1,2,*, Adrian Bulat1,* & Georgios Tzimiropoulos1,2 1 Samsung AI Research Center, Cambridge, UK 2 Computer Vision Laboratory, The University of Nottingham, UK |
| Pseudocode | No | The paper describes procedural steps (e.g., for progressive teacher-student training) in prose, but it does not include formally formatted pseudocode or algorithm blocks (a hedged sketch of one such step is given after the table). |
| Open Source Code | Yes | Code available at https://github.com/brais-martinez/real2binary. |
| Open Datasets | Yes | We used ImageNet (Russakovsky et al., 2015) as a benchmark to compare our method against other state-of-the-art approaches in Sec. 5.1. ... We further used CIFAR-100 (Krizhevsky & Hinton, 2009) to conduct ablation studies (Sec. 5.2). |
| Dataset Splits | Yes | We used ImageNet (Russakovsky et al., 2015)... We further used CIFAR-100 (Krizhevsky & Hinton, 2009)... For CIFAR-100, we trained for 350 epochs, with steps at epochs 150, 250 and 320. For ImageNet, we train for 75 epochs, with steps at epochs 40, 60 and 70. |
| Hardware Specification | No | The paper does not specify the hardware used for experiments (e.g., specific GPU models, CPU types, or cloud instances). |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | Weight decay: We use 1e-5 when training stage 1 ... and set it to 0 on stage 2. Warm-up: We used warm-up for 5 epochs during stage 1 and no warm-up for stage 2. Optimizer: We used Adam ... The learning rate is set to 1e-3 for stage 1, and 2e-4 for stage 2. For CIFAR-100, we trained for 350 epochs, with steps at epochs 150, 250 and 320. For ImageNet, we train for 75 epochs, with steps at epochs 40, 60 and 70. Batch sizes are 256 for ImageNet and 128 for CIFAR-100. |
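As the Pseudocode row notes, the progressive teacher-student procedure is described only in prose: a real-valued teacher guides the binary student block by block. The snippet below is a minimal PyTorch sketch of a block-wise attention-matching distillation loss in the style this line of work builds on (Zagoruyko & Komodakis, 2017); the helper names `attention_map` and `attention_matching_loss` are ours, not the authors', and the exact loss weighting used in the paper is not reproduced here.

```python
import torch
import torch.nn.functional as F

def attention_map(feat: torch.Tensor) -> torch.Tensor:
    # Collapse a (N, C, H, W) feature tensor to a spatial attention map by
    # averaging squared activations over channels, then L2-normalise per sample.
    att = feat.pow(2).mean(dim=1).flatten(start_dim=1)  # (N, H*W)
    return F.normalize(att, p=2, dim=1)

def attention_matching_loss(student_feats, teacher_feats):
    # Mean squared distance between normalised attention maps taken at
    # matched depths of the binary student and the real-valued teacher.
    losses = [
        (attention_map(s) - attention_map(t)).pow(2).sum(dim=1).mean()
        for s, t in zip(student_feats, teacher_feats)
    ]
    return torch.stack(losses).mean()
```

In the paper's progressive scheme this distillation term is combined with the standard classification loss while the student is binarised in stages (activations first, then weights); the staging itself is given only in prose, so the above is an illustration rather than the authors' exact recipe.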
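The Experiment Setup row pins down enough hyper-parameters to reconstruct the optimizer and learning-rate schedule. Below is a minimal PyTorch sketch of that two-stage configuration; the learning rates, weight decays, epoch counts, batch sizes, and LR-step epochs are quoted from the paper, while the decay factor and the omission of the 5-epoch warm-up are assumptions flagged in the comments.

```python
import torch

# Quoted from the paper: LR decay steps, total epochs, and batch sizes.
MILESTONES   = {"imagenet": [40, 60, 70], "cifar100": [150, 250, 320]}
TOTAL_EPOCHS = {"imagenet": 75,  "cifar100": 350}
BATCH_SIZES  = {"imagenet": 256, "cifar100": 128}

def make_optimizer(model: torch.nn.Module, stage: int, dataset: str):
    # Stage 1: Adam, lr 1e-3, weight decay 1e-5 (the paper also uses a
    # 5-epoch warm-up here, which this sketch omits).
    # Stage 2: Adam, lr 2e-4, weight decay 0, no warm-up.
    lr = 1e-3 if stage == 1 else 2e-4
    wd = 1e-5 if stage == 1 else 0.0
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=wd)

    # The paper does not state the step-decay factor; gamma=0.1
    # (MultiStepLR's default) is an assumption.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=MILESTONES[dataset], gamma=0.1
    )
    return optimizer, scheduler
```

A typical call would be `optimizer, scheduler = make_optimizer(model, stage=1, dataset="imagenet")`, stepping the scheduler once per epoch for the quoted 75 (ImageNet) or 350 (CIFAR-100) epochs.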