Four Things Everyone Should Know to Improve Batch Normalization
Authors: Cecilia Summers, Michael J. Dinneen
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our results empirically on six datasets: CIFAR-100, SVHN, Caltech-256, Oxford Flowers-102, CUB-2011, and ImageNet. |
| Researcher Affiliation | Academia | Cecilia Summers Department of Computer Science University of Auckland cecilia.summers.07@gmail.com Michael J. Dinneen Department of Computer Science University of Auckland mjd@cs.auckland.ac.nz |
| Pseudocode | No | The paper contains mathematical equations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | We have released code at https://github.com/ceciliaresearch/four_things_batch_norm. |
| Open Datasets | Yes | We validate our results empirically on six datasets: CIFAR-100, SVHN, Caltech-256, Oxford Flowers-102, CUB-2011, and ImageNet. ... ImageNet ILSVRC 2012 validation set (Russakovsky et al., 2015) ... CIFAR-100 (Krizhevsky & Hinton, 2009) ... SVHN (Netzer et al., 2011) ... Flowers-102 (Nilsback & Zisserman, 2008) ... CUB-2011 (Wah et al., 2011) |
| Dataset Splits | Yes | Of the six datasets we experiment with, only ImageNet (Russakovsky et al., 2015) and Flowers-102 (Nilsback & Zisserman, 2008) have their own pre-defined validation split, so we constructed validation splits for the other datasets as follows: for CIFAR-100 (Krizhevsky & Hinton, 2009), we randomly took 40,000 of the 50,000 training images for the training split, and the remaining 10,000 as a validation split. For SVHN (Netzer et al., 2011), we similarly split the 604,388 non-test images in an 80-20% split for training and validation. For Caltech-256, no canonical splits of any form are defined, so we used 40 images of each of the 256 categories for training, 10 images for validation, and 30 for testing. For CUB-2011, we used 25% of the given training data as a validation set. |
| Hardware Specification | Yes | All experiments were done on two Nvidia Geforce GTX 1080 Ti GPUs. |
| Software Dependencies | No | The paper mentions the TensorFlow-slim image classification model library but does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | The model used for CIFAR-100 and SVHN was ResNet-18 (He et al., 2016b;a) with 64, 128, 256, and 512 filters across blocks. For Caltech-256, a much larger Inception-v3 (Szegedy et al., 2016) model was used, and we additionally experiment with ResNet-152 (He et al., 2016b) on Flowers-102 and CUB-2011 in Sec. 4.3. All experiments were done on two Nvidia Geforce GTX 1080 Ti GPUs. ... with an overall batch size of B and a ghost batch size of B ... with a batch size B = 128. |
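The experiment-setup quote refers to ghost batch normalization: an overall batch of size B is split into smaller "ghost" batches, and normalization statistics are computed per ghost batch rather than over the full batch. The sketch below is a minimal NumPy illustration of that idea, not the authors' released code; the function name, the 2D `(batch, features)` input shape, and the `eps` default are assumptions for the example.

```python
import numpy as np

def ghost_batch_norm(x, ghost_size, eps=1e-5):
    """Normalize each ghost batch of `x` independently.

    x: array of shape (B, features); B must be divisible by ghost_size.
    Illustrative sketch only (inference-scale shift/scale params omitted).
    """
    B, _ = x.shape
    assert B % ghost_size == 0, "batch size must be divisible by ghost size"
    out = np.empty_like(x, dtype=float)
    for start in range(0, B, ghost_size):
        ghost = x[start:start + ghost_size].astype(float)
        # Statistics come from the ghost batch alone, not the full batch.
        mean = ghost.mean(axis=0)
        var = ghost.var(axis=0)
        out[start:start + ghost_size] = (ghost - mean) / np.sqrt(var + eps)
    return out
```

With B = 128 (as in the quoted setup) and a ghost size of, say, 16, each of the eight ghost batches is normalized with its own mean and variance, which adds regularizing noise relative to full-batch statistics.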