Generalized Batch Normalization: Towards Accelerating Deep Neural Networks
Authors: Xiaoyong Yuan, Zheng Feng, Matthew Norton, Xiaolin Li
AAAI 2019, pp. 1682-1689
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Utilizing the suggested deviation measure and statistic, we show experimentally that training is accelerated more so than with conventional BN, often with improved error rate as well. |
| Researcher Affiliation | Academia | Xiaoyong Yuan, University of Florida, chbrian@ufl.edu; Zheng Feng, University of Florida, fengzheng@ufl.edu; Matthew Norton, Naval Postgraduate School, mnorton@nps.edu; Xiaolin Li, University of Florida, andyli@ece.ufl.edu |
| Pseudocode | No | The paper does not contain any sections or figures explicitly labeled as 'Pseudocode' or 'Algorithm', nor any structured code blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | We demonstrate on MNIST, CIFAR-10, CIFAR-100, and SVHN datasets that the speed of convergence of stochastic gradient descent (SGD) can be increased by simply choosing a different D and S and that, in some settings, we obtain improved predictive performance. |
| Dataset Splits | No | The paper mentions training on datasets and evaluating on a 'held out test set' but does not provide specific details on the train/validation/test split percentages or sample counts for reproduction. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for running experiments, such as GPU models, CPU specifications, or memory. |
| Software Dependencies | No | The paper discusses neural network architectures and optimizers but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We conduct classification on MNIST (LeCun et al. 1998) with neural network architecture LeNet with the input size of 28x28 and two convolutional layers with kernel size 5, and number of filters 20 and 50 respectively. ... with vanilla SGD as the optimizer, with learning rate equal to 0.01, and batch size equal to 1000. (A hedged code reconstruction of this setup follows the table.) |
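
The generalized scheme referenced in the Research Type and Open Datasets rows, replacing the statistic S and deviation measure D used for normalization, can be illustrated with a short sketch. This is a hypothetical PyTorch module, not the authors' implementation (no code was released); the `GeneralizedBatchNorm1d` name, the default callables, and the mean-absolute-deviation example are illustrative assumptions, and running statistics and inference-mode handling are omitted for brevity.

```python
import torch
import torch.nn as nn


class GeneralizedBatchNorm1d(nn.Module):
    """Hypothetical sketch: a BN-style layer whose centering statistic S and
    deviation measure D are swappable callables. S = mean and D = standard
    deviation recover conventional batch normalization."""

    def __init__(self, num_features, statistic=None, deviation=None, eps=1e-5):
        super().__init__()
        # Defaults correspond to conventional BN: per-feature mean and std.
        self.statistic = statistic or (lambda x: x.mean(dim=0))
        self.deviation = deviation or (lambda x, s: (x - s).pow(2).mean(dim=0).sqrt())
        self.eps = eps
        # Learnable affine parameters, as in standard BN.
        self.gamma = nn.Parameter(torch.ones(num_features))
        self.beta = nn.Parameter(torch.zeros(num_features))

    def forward(self, x):
        s = self.statistic(x)                  # per-feature statistic S
        d = self.deviation(x, s)               # per-feature deviation D
        x_hat = (x - s) / (d + self.eps)       # normalize (eps added for stability)
        return self.gamma * x_hat + self.beta  # affine transform


# Example: swap in mean absolute deviation as D (one possible alternative choice,
# shown purely for illustration).
mad = lambda x, s: (x - s).abs().mean(dim=0)
layer = GeneralizedBatchNorm1d(50, deviation=mad)
out = layer(torch.randn(1000, 50))
```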
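
The Experiment Setup row can similarly be reconstructed as a minimal training loop. PyTorch and torchvision are assumptions here (the paper names no framework, per the Software Dependencies row); only the 28x28 input, the two kernel-size-5 convolutions with 20 and 50 filters, vanilla SGD with learning rate 0.01, and batch size 1000 come from the quoted excerpt, while the pooling, activations, and fully connected widths are filled in from the standard LeNet variant.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms


class LeNet(nn.Module):
    """LeNet-style network matching the quoted description: 28x28 input,
    conv layers with kernel size 5 and 20/50 filters. Pooling and the
    fully connected widths are assumptions, not stated in the excerpt."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 20, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(20, 50, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(50 * 4 * 4, 500), nn.ReLU(),
            nn.Linear(500, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())
loader = DataLoader(train_set, batch_size=1000, shuffle=True)  # batch size 1000

model = LeNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # vanilla SGD, lr 0.01
criterion = nn.CrossEntropyLoss()

# One pass over the training set; epoch count is not given in the excerpt.
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```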