Training Binary Neural Networks using the Bayesian Learning Rule
Authors: Xiangming Meng, Roman Bachmann, Mohammad Emtiyaz Khan
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present numerical experiments to demonstrate the performance of BayesBiNN on both synthetic and real image data for different kinds of neural network architectures. |
| Researcher Affiliation | Academia | RIKEN Center for Advanced Intelligence Project (AIP), Tokyo, Japan; École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland. |
| Pseudocode | Yes | STE makes a particular choice in step 1, where a sign function is used to obtain the binary weights from the real-valued weights (see Table 1 for pseudocode; a minimal STE sketch appears below this table). |
| Open Source Code | Yes | The code to reproduce the results is available at https://github.com/team-approx-bayes/BayesBiNN. |
| Open Datasets | Yes | We now present results on three benchmark real datasets widely used for image classification: MNIST (LeCun & Cortes, 2010), CIFAR-10 (Krizhevsky & Hinton, 2009) and CIFAR-100 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | Yes | For all experiments, the standard categorical cross-entropy loss is used; we take 10% of the training set for validation and report the test accuracy at the checkpoint with the highest validation accuracy achieved during training (a data-split sketch appears below this table). |
| Hardware Specification | No | The paper mentions using the 'RAIDEN computing system' for experiments but does not provide specific details such as exact GPU/CPU models, processor types, or memory amounts. |
| Software Dependencies | No | The paper mentions various optimizers and frameworks (e.g., Adam, Bop, PyTorch) but does not provide version numbers for any of the software dependencies used in the experiments. |
| Experiment Setup | Yes | The details of the experimental setting, including the detailed network architecture and values of all hyper-parameters, are provided in Appendix B.2 in the supplementary material. |
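
The STE step quoted in the Pseudocode row is simple enough to sketch. Below is a minimal, hypothetical PyTorch implementation of the clipped straight-through estimator that the paper contrasts with BayesBiNN: the forward pass binarizes weights with a sign function, and the backward pass lets gradients flow through unchanged where the real-valued weight lies in [-1, 1]. The class name `SignSTE` is illustrative and not taken from the authors' repository.

```python
import torch


class SignSTE(torch.autograd.Function):
    """Sign binarization with a clipped straight-through gradient estimator."""

    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        # Map weights to {-1, +1}; torch.sign alone would map exact zeros to 0.
        return torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # Straight-through: identity gradient inside [-1, 1], zero outside.
        return grad_output * (w.abs() <= 1).to(grad_output.dtype)


# Usage: binarize real-valued weights inside a layer's forward pass.
w = torch.randn(4, 3, requires_grad=True)
w_b = SignSTE.apply(w)   # binary weights in {-1, +1}
loss = w_b.sum()         # dummy loss for illustration
loss.backward()          # gradients reach w through the STE
```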
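The validation protocol quoted in the Dataset Splits row can likewise be sketched. The following is a minimal assumption-laden example using torchvision's MNIST loader: the paper does not specify whether the 10% holdout is random or stratified, so a seeded random split is assumed here for illustration.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Load the full MNIST training set (hypothetical path and transform).
train_full = datasets.MNIST(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),
)

# Hold out 10% of the training set for validation, as described in the paper.
n_val = len(train_full) // 10
n_train = len(train_full) - n_val
train_set, val_set = random_split(
    train_full, [n_train, n_val],
    generator=torch.Generator().manual_seed(0),  # assumed seed for reproducibility
)
```

During training, one would then track validation accuracy after each epoch and report the test accuracy from the checkpoint that achieved the best validation accuracy, matching the model-selection rule quoted above.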