Mode Normalization

Authors: Lucas Deecke, Iain Murray, Hakan Bilen

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate that our method outperforms BN and other widely used normalization techniques in several experiments, including single and multi-task datasets."
Researcher Affiliation | Academia | "Lucas Deecke, Iain Murray & Hakan Bilen, University of Edinburgh, {l.deecke,i.murray,h.bilen}@ed.ac.uk"
Pseudocode | Yes | "Algorithm 1 Mode normalization, training phase. Algorithm 2 Mode normalization, test phase. Algorithm 3 Mode group normalization." (A hedged code sketch of the training/test-phase computation follows the table.)
Open Source Code | Yes | "Accompanying code is available under github.com/ldeecke/mn-torch."
Open Datasets | Yes | "i) MNIST (Le Cun, 1998) [...] ii) CIFAR-10 (Krizhevsky, 2009) [...] iii) SVHN (Netzer et al., 2011) [...] iv) Fashion MNIST (Xiao et al., 2017) [...] ILSVRC12 (Deng et al., 2009)."
Dataset Splits | Yes | "The dataset has a total of 60 000 training samples, as well as 10 000 samples set aside for validation. [...] CIFAR-10 [...] It contains 50 000 training and 10 000 test images."
Hardware Specification | Yes | "We gratefully acknowledge the support of Prof. Vittorio Ferrari and Timothy Hospedales for providing computational resources, and the NVIDIA Corporation for the donation of a Titan Xp GPU used in this research."
Software Dependencies | No | "All experiments use standard routines within PyTorch (Paszke et al., 2017)." While PyTorch is mentioned, a specific version number for the software is not provided.
Experiment Setup | Yes | "We trained for 3.5 million data touches (15 epochs), with learning rate reductions by 1/10 after 2.5 and 3 million data touches. [...] The batch size was N = 128, and running estimates were kept with λ = 0.1. We varied the number of modes in MN over K = {2, 4, 6}. [...] Initial learning rates were set to γ = 10⁻¹, which we reduced by 1/10 at epochs 65 and 80 for all methods. [...] Dropout (Srivastava et al., 2014) is known to occasionally cause issues in combination with BN (Li et al., 2018), and reducing it to 0.25 (as opposed to 0.5 in the original publication) improved performance." (A hedged configuration sketch of this schedule follows the table.)
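The Pseudocode row above refers to Algorithms 1–3, which describe mode normalization: a small gating network softly assigns each sample to one of K modes, gate-weighted statistics are estimated per mode, and each sample is normalized under a gate-weighted mixture of those statistics, with running estimates kept for the test phase. The sketch below is a minimal PyTorch rendering of that idea under our own assumptions; the class name `ModeNorm2d`, the average-pooled gating input, and the buffer handling are illustrative choices, not the reference implementation from mn-torch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModeNorm2d(nn.Module):
    """Hedged sketch of mode normalization over K modes for NCHW feature maps."""

    def __init__(self, num_features, num_modes=2, eps=1e-5, momentum=0.1):
        super().__init__()
        self.K = num_modes
        self.eps = eps
        self.momentum = momentum  # plays the role of the running-estimate lambda
        # Shared affine transform, as in batch normalization.
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))
        # Gating network: global average pool -> linear -> softmax over modes.
        self.gate = nn.Linear(num_features, num_modes)
        self.register_buffer("running_mean", torch.zeros(num_modes, num_features))
        self.register_buffer("running_var", torch.ones(num_modes, num_features))

    def forward(self, x):
        N, C, H, W = x.shape
        # Soft assignment of each sample to the K modes, shape (N, K).
        g = F.softmax(self.gate(x.mean(dim=(2, 3))), dim=1)
        if self.training:
            # Gate-weighted mean/variance per mode over the (N, H, W) axes.
            w = g.t() / (g.t().sum(dim=1, keepdim=True) * H * W + self.eps)  # (K, N)
            xs = x.reshape(N, C, H * W)
            mean = torch.einsum("kn,nci->kc", w, xs)                         # (K, C)
            var = torch.einsum("kn,nci->kc", w, xs.pow(2)) - mean.pow(2)
            with torch.no_grad():
                self.running_mean.mul_(1 - self.momentum).add_(self.momentum * mean)
                self.running_var.mul_(1 - self.momentum).add_(self.momentum * var)
        else:
            # Test phase: fall back on the running estimates.
            mean, var = self.running_mean, self.running_var
        # Normalize under every mode, then mix the results with the gates.
        x_hat = (x.unsqueeze(1) - mean.view(1, self.K, C, 1, 1)) \
            / torch.sqrt(var.view(1, self.K, C, 1, 1) + self.eps)            # (N, K, C, H, W)
        x_hat = (g.view(N, self.K, 1, 1, 1) * x_hat).sum(dim=1)              # (N, C, H, W)
        return self.weight.view(1, C, 1, 1) * x_hat + self.bias.view(1, C, 1, 1)
```

With K = 1 the gate becomes a constant and the layer reduces to ordinary batch normalization, which is why the paper treats BN as the single-mode special case.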
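The Experiment Setup row quotes the schedule used for the CIFAR-style runs. Below is a minimal sketch wiring those quoted numbers (batch size N = 128, initial learning rate 10⁻¹ with 1/10 reductions at epochs 65 and 80, running estimates with λ = 0.1, K = 2 modes, dropout lowered to 0.25) into a PyTorch training loop, reusing the `ModeNorm2d` sketch above. The tiny stand-in network, the choice of plain SGD, the total epoch count, and the random batches are assumptions for illustration, not details taken from the paper.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import MultiStepLR

# Stand-in network: one conv block with mode normalization (K = 2, lambda = 0.1)
# and dropout at 0.25, followed by a linear classifier for 10 classes.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    ModeNorm2d(16, num_modes=2, momentum=0.1),
    nn.ReLU(),
    nn.Dropout(0.25),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)

optimizer = optim.SGD(model.parameters(), lr=1e-1)                  # initial gamma = 10^-1
scheduler = MultiStepLR(optimizer, milestones=[65, 80], gamma=0.1)  # 1/10 drops at epochs 65, 80
loss_fn = nn.CrossEntropyLoss()

for epoch in range(90):                        # total epoch count assumed
    x = torch.randn(128, 3, 32, 32)            # stand-in for a CIFAR-10 batch, N = 128
    y = torch.randint(0, 10, (128,))
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()
    scheduler.step()
```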