Explicit Regularisation in Gaussian Noise Injections
Authors: Alexander Camuto, Matthew Willetts, Umut Simsekli, Stephen J. Roberts, Chris C. Holmes
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here we derive the explicit regulariser of GNIs, obtained by marginalising out the injected noise, and show that it penalises functions with high-frequency components in the Fourier domain, particularly in layers closer to a neural network's output. We show analytically and empirically that such regularisation produces calibrated classifiers with large classification margins. (A hedged sketch of a GNI layer follows the table.) |
| Researcher Affiliation | Academia | Alexander Camuto (University of Oxford, Alan Turing Institute) acamuto@turing.ac.uk; Matthew Willetts (University of Oxford, Alan Turing Institute) mwilletts@turing.ac.uk; Umut Şimşekli (University of Oxford, Institut Polytechnique de Paris) umut.simsekli@telecom-paris.fr; Stephen Roberts (University of Oxford, Alan Turing Institute) sjrob@robots.ox.ac.uk; Chris Holmes (University of Oxford, Alan Turing Institute) cholmes@stats.ox.ac.uk |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | In (a,b) we plot R(·) vs E[C(·)] at initialisation for 6-layer MLPs with GNIs at each 256-neuron layer, with the same variance σ² ∈ [0.1, 0.25, 1.0, 4.0] at each layer. Each point corresponds to one of 250 different network initialisations acting on a batch of size 32 for the CIFAR10 classification dataset and the Boston House Prices (BHP) regression dataset. Figure (a) shows the test set loss for convolutional models (CONV) and 4-layer MLPs trained on SVHN with R(·) and GNIs for σ² = 0.1, and with no noise (Baseline). |
| Dataset Splits | No | The paper mentions using a 'test set' but does not explicitly provide the training, validation, and test dataset splits (e.g., percentages or exact counts) for any of the datasets used. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., GPU models, CPU types, or memory specifications). |
| Software Dependencies | No | The paper does not specify any software dependencies with their version numbers. |
| Experiment Setup | Yes | In (a,b) we plot R(·) vs E[C(·)] at initialisation for 6-layer MLPs with GNIs at each 256-neuron layer, with the same variance σ² ∈ [0.1, 0.25, 1.0, 4.0] at each layer. In (c,d) we plot ratio = |E[C(·)]| / R(·) over the first 100 training iterations for 10 randomly initialised networks. Shading corresponds to the standard deviation of values over the 10 networks. Figure (a) shows the test set loss for convolutional models (CONV) and 4-layer MLPs trained on SVHN with R(·) and GNIs for σ² = 0.1, and with no noise (Baseline). We use 6-layer-deep, 256-unit-wide ReLU networks on the same dataset as in Figure 4, trained with GNI and without GNI (Baseline). (A hedged configuration sketch of this setup follows the table.) |
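
For readers unfamiliar with Gaussian noise injections (GNIs), the following is a minimal sketch of a GNI layer that adds zero-mean Gaussian noise to activations during training. It assumes PyTorch and a chosen per-layer variance; it is an illustration of the general technique, not the authors' code.

```python
# Minimal sketch of a Gaussian noise injection (GNI) layer, assuming PyTorch.
# Illustrative only: layer name and placement are assumptions, not the paper's code.
import torch
import torch.nn as nn


class GaussianNoiseInjection(nn.Module):
    """Adds zero-mean Gaussian noise with variance sigma2 to its input at train time."""

    def __init__(self, sigma2: float = 0.1):
        super().__init__()
        self.sigma = sigma2 ** 0.5  # standard deviation derived from the variance

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            return x + self.sigma * torch.randn_like(x)
        return x  # injection is disabled at evaluation time
```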
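
The experiment-setup row quotes 6-layer, 256-unit ReLU MLPs with a GNI of the same variance at every layer, swept over σ² ∈ [0.1, 0.25, 1.0, 4.0] on batches of size 32. The sketch below reconstructs that configuration using the GaussianNoiseInjection module above; the input/output dimensions, the flattening of CIFAR10 images, and the exact placement of noise after each hidden layer are assumptions rather than details taken from the paper.

```python
# Hypothetical reconstruction of the quoted setup: a 6-layer, 256-unit ReLU MLP
# with a GNI of the same variance after every hidden layer. Builds on the
# GaussianNoiseInjection module sketched above; all other details are assumptions.
def make_gni_mlp(in_dim: int, out_dim: int, sigma2: float,
                 depth: int = 6, width: int = 256) -> nn.Sequential:
    layers, dim = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(dim, width), nn.ReLU(), GaussianNoiseInjection(sigma2)]
        dim = width
    layers.append(nn.Linear(dim, out_dim))
    return nn.Sequential(*layers)


# One variance from the sweep, applied to a stand-in CIFAR10 batch of size 32
# (images flattened to 3*32*32 = 3072 features, 10 output classes).
model = make_gni_mlp(in_dim=3 * 32 * 32, out_dim=10, sigma2=0.25)
batch = torch.randn(32, 3 * 32 * 32)
logits = model(batch)
```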