Batch Normalization Orthogonalizes Representations in Deep Random Networks
Authors: Hadi Daneshmand, Amir Joudaki, Francis Bach
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments presented in Fig. 2a validate the exponential decay rate of V with depth. In this plot, we see that log(V_ℓ) decreases linearly for ℓ = 1, ..., 20, then fluctuates around a small constant. Our experiments in Fig. 2b suggest that the O(1/√d) dependency on width is almost tight. (See the orthogonality-gap sketch after the table.) |
| Researcher Affiliation | Academia | Hadi Daneshmand (INRIA Paris, seyed.daneshmand@inria.fr); Amir Joudaki (ETH Zurich, amir.joudaki@inf.ethz.ch); Francis Bach (INRIA-ENS-PSL Paris, francis.bach@inria.fr) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Implementations are available at https://github.com/hadidaneshmand/batchnorm21.git |
| Open Datasets | Yes | The learning task is classification with cross entropy loss for CIFAR10 dataset (Krizhevsky et al., 2009, MIT license). |
| Dataset Splits | No | The paper mentions using the CIFAR10 dataset and a batch size for training, but does not specify exact training/validation/test splits or provide citations to predefined splits for reproduction. |
| Hardware Specification | Yes | We use PyTorch (Paszke et al., 2019, BSD license) and Google Colaboratory platform with a single Tesla P100 GPU with 16GB memory in all the experiments. |
| Software Dependencies | No | The paper mentions using PyTorch (Paszke et al., 2019, BSD license) and the Google Colaboratory platform, but does not specify version numbers for the software dependencies required to replicate the experiments. |
| Experiment Setup | Yes | Throughout the experiments, we use a vanilla MLP (without BN) with a width of 800 across all hidden layers, ReLU activation, and Xavier's method for weight initialization (Glorot and Bengio, 2010). We use SGD with step size 0.01 and batch size 500 for training. (See the configuration sketch after the table.) |
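
The Research Type row refers to the decay of an orthogonality measure V_ℓ with depth and its O(1/√d) floor in width. Below is a minimal orthogonality-gap sketch of such a depth sweep, assuming V is the Frobenius distance between the normalized Gram matrix of the batch representations and a scaled identity, and assuming random Gaussian linear layers followed by standard batch normalization; the sizes (width 800, batch 64, 30 layers) and the exact BN variant are illustrative assumptions, not the paper's protocol.

```python
# Illustrative sketch (not the paper's code): propagate a random batch through a
# deep random network with batch normalization and track an orthogonality-gap
# proxy V_l per layer. The definition of V and all sizes are assumptions.
import torch

def orthogonality_gap(h):
    """Frobenius distance between the normalized Gram matrix of the batch
    representations h (shape B x d) and the scaled identity I_B / B."""
    B = h.shape[0]
    gram = (h @ h.t()) / h.norm() ** 2      # Gram matrix normalized by ||h||_F^2
    return (gram - torch.eye(B) / B).norm().item()

def batch_norm(h, eps=1e-5):
    """Standard BN across the batch dimension (mean-center and rescale each
    coordinate); the paper's exact BN variant may differ."""
    return (h - h.mean(dim=0)) / (h.std(dim=0, unbiased=False) + eps)

torch.manual_seed(0)
d, B, depth = 800, 64, 30                   # width, batch size, layers (assumed)
h = torch.randn(B, d)                       # random input batch

gaps = []
for _ in range(depth):
    W = torch.randn(d, d) / d ** 0.5        # random Gaussian weights, variance 1/d
    h = batch_norm(h @ W.t())               # linear map followed by batch norm
    gaps.append(orthogonality_gap(h))

# log(V_l) should decrease roughly linearly with depth, then plateau at a
# width-dependent floor (compare Fig. 2a/2b in the paper).
print(["%.2e" % g for g in gaps])
```

Repeating the same sweep for several widths d gives a rough picture of how the plateau scales with width.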
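The Experiment Setup row translates into roughly the following training configuration. This is a sketch under stated assumptions: the width of 800, ReLU activation, Xavier initialization, SGD with step size 0.01, batch size 500, and CIFAR10 with cross-entropy loss come from the table above, while the network depth, epoch count, and data transforms are placeholders chosen for illustration.

```python
# Hypothetical reconstruction of the quoted training setup, not the authors' code.
# Width 800, ReLU, Xavier init, SGD lr 0.01, batch size 500, CIFAR10 + cross-entropy
# come from the paper's description; depth, epochs, and transforms are assumed.
import torch
import torch.nn as nn
from torchvision import datasets, transforms

def make_mlp(depth=10, width=800, num_classes=10):
    layers, in_dim = [], 3 * 32 * 32        # CIFAR10 images flattened
    for _ in range(depth):                  # depth is an assumption
        linear = nn.Linear(in_dim, width)
        nn.init.xavier_uniform_(linear.weight)   # Xavier initialization
        nn.init.zeros_(linear.bias)
        layers += [linear, nn.ReLU()]
        in_dim = width
    layers.append(nn.Linear(in_dim, num_classes))
    return nn.Sequential(nn.Flatten(), *layers)

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(train_set, batch_size=500, shuffle=True)

model = make_mlp()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # step size 0.01
criterion = nn.CrossEntropyLoss()

for epoch in range(1):                      # epoch count is an assumption
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```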