Ghost Noise for Regularizing Deep Neural Networks
Authors: Atli Kosson, Dongyang Fan, Martin Jaggi
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We investigate the effectiveness of GBN by disentangling the induced Ghost Noise from normalization and quantitatively analyzing the distribution of noise as well as its impact on model performance. Experimentally, we find that GNI can deliver a stronger regularization effect than GBN, resulting in improved test performance across a wide range of training settings. |
| Researcher Affiliation | Academia | EPFL, Switzerland atli.kosson@epfl.ch, dongyang.fan@epfl.ch, martin.jaggi@epfl.ch |
| Pseudocode | Yes | Figure 2: A minimal PyTorch implementation of Ghost Noise Injection for convolutional activation maps without any performance optimizations. (A hedged sketch of such a module appears below the table.) |
| Open Source Code | No | The paper mentions using open-source libraries like PyTorch Image Models and vit-pytorch, but does not explicitly state that the code for their proposed methods (GNI, XBN) is open-source or provide a link. |
| Open Datasets | Yes | Krizhevsky, A.; and Hinton, G. 2009. CIFAR-100 (Canadian Institute For Advanced Research). Dataset available from https://www.cs.toronto.edu/~kriz/cifar.html. Deng, L. 2012. The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6): 141–142. Table 2: Test accuracy for ResNet-20 CIFAR-10 (mean ± std % for 3 runs) and ResNet-50 ImageNet-1k for different normalization and noise setups. |
| Dataset Splits | Yes | Figure 3: CIFAR-100 ResNet-18 validation accuracy versus ghost batch size and dropout probability for different methods. The optimal N = 16 gives an accuracy boost of just over 1% on both the validation and the test sets (Table 1). The ghost batch size N was tuned on a validation set and varies considerably between the settings. |
| Hardware Specification | Yes | All experiments are run on a server with NVIDIA A100 GPUs. |
| Software Dependencies | Yes | Our experiments are performed using PyTorch (Paszke et al. 2019) in Python 3.9. For the Normalization-Free ResNet, SimpleViT, and ConvMixer, we use the implementations from PyTorch Image Models (Wightman 2019) and vit-pytorch (Wang 2023). (An instantiation sketch appears below the table.) |
| Experiment Setup | Yes | For the CIFAR-100 experiments we use ResNet-18 (He et al. 2016) with the SGD optimizer (momentum 0.9, weight decay 5e-4), trained for 200 epochs with a batch size of 256. The initial learning rate is 0.1, decayed by a factor of 10 at epochs 100 and 150. For ImageNet-1k, we use ResNet-50 (He et al. 2016) with the same optimizer and batch size, trained for 100 epochs with the learning rate decayed at epochs 30, 60, and 90. (A configuration sketch appears below the table.) |
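The Pseudocode row quotes the paper's Figure 2, a minimal PyTorch implementation of Ghost Noise Injection (GNI) for convolutional activation maps. The paper's code is not reproduced in this report; the snippet below is an unoptimized sketch of how ghost-batch-statistic noise could be injected, under the assumption that each sample is re-mapped from the statistics of a randomly drawn ghost batch to the full-batch statistics. The class name, default ghost batch size, and exact renormalization form are assumptions, not details confirmed by the excerpt.

```python
import torch
import torch.nn as nn


class GhostNoiseInjection(nn.Module):
    """Hedged sketch of ghost-batch-statistic noise injection (not the paper's Figure 2 code)."""

    def __init__(self, ghost_batch_size: int = 16, eps: float = 1e-5):
        super().__init__()
        self.ghost_batch_size = ghost_batch_size
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) convolutional activation maps; acts as identity at eval time.
        if not self.training or x.shape[0] < 2:
            return x
        B, C = x.shape[0], x.shape[1]
        # Full-batch statistics per channel.
        mu_b = x.mean(dim=(0, 2, 3), keepdim=True)                   # (1, C, 1, 1)
        var_b = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)   # (1, C, 1, 1)
        # Draw an independent random ghost batch (with replacement) for every sample.
        idx = torch.randint(B, (B, self.ghost_batch_size), device=x.device)
        ghost = x[idx]                                               # (B, N, C, H, W)
        mu_g = ghost.mean(dim=(1, 3, 4)).view(B, C, 1, 1)
        var_g = ghost.var(dim=(1, 3, 4), unbiased=False).view(B, C, 1, 1)
        # Re-map each sample from its ghost statistics to the full-batch statistics;
        # the deviation between the two sets of statistics acts as the injected noise.
        scale = torch.sqrt((var_b + self.eps) / (var_g + self.eps))
        return (x - mu_g) * scale + mu_b
```

In this sketch the expected transform is close to the identity, so the layer injects noise without changing how the following normalization layer behaves on average.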
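As noted in the Software Dependencies row, the Normalization-Free ResNet, SimpleViT, and ConvMixer baselines come from PyTorch Image Models (timm) and vit-pytorch. The sketch below only illustrates instantiating such models through those libraries; the specific model names and hyperparameters are assumptions, not values taken from the paper.

```python
import timm
from vit_pytorch import SimpleViT

# Illustrative instantiation via the cited libraries (model variants are assumptions).
nf_resnet = timm.create_model("nf_resnet50", num_classes=1000)        # Normalization-Free ResNet
convmixer = timm.create_model("convmixer_768_32", num_classes=1000)   # ConvMixer

simple_vit = SimpleViT(
    image_size=224, patch_size=16, num_classes=1000,
    dim=384, depth=12, heads=6, mlp_dim=1536,
)
```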
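The Experiment Setup row fully specifies the CIFAR-100 optimizer and schedule; a minimal sketch of that configuration in PyTorch follows. The torchvision ResNet-18 and the training-loop helper are stand-ins, since the paper's CIFAR variant of ResNet-18 and its data pipeline are not given in the excerpt.

```python
import torch
import torchvision

# Sketch of the quoted CIFAR-100 setup: SGD (momentum 0.9, weight decay 5e-4),
# 200 epochs, batch size 256, lr 0.1 decayed by 10x at epochs 100 and 150.
# torchvision's ResNet-18 is an assumption; the paper's CIFAR ResNet-18 may differ.
model = torchvision.models.resnet18(num_classes=100)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100, 150], gamma=0.1)

# for epoch in range(200):
#     train_one_epoch(model, optimizer, batch_size=256)  # hypothetical helper
#     scheduler.step()
```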