Continual Normalization: Rethinking Batch Normalization for Online Continual Learning

Authors: Quang Pham, Chenghao Liu, Steven HOI

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on different continual learning algorithms and online scenarios show that CN is a direct replacement for BN and can provide substantial performance improvements.
Researcher Affiliation | Collaboration | 1 Singapore Management University, hqpham.2017@smu.edu.sg; 2 Salesforce Research Asia, {chenghao.liu, shoi}@salesforce.com
Pseudocode | Yes | In the following, we provide the CN's implementation based on PyTorch (Paszke et al., 2017); see the usage sketch after the table.

    import torch.nn.functional as F
    from torch.nn.modules.batchnorm import _BatchNorm

    class CN(_BatchNorm):
        def __init__(self, num_features, eps=1e-5, G=32, momentum=0.1):
            super(CN, self).__init__(num_features, eps, momentum)
            self.G = G  # number of groups for the group-normalization step

        def forward(self, input):
            # Group-normalize the input (no affine parameters), then batch-normalize
            # the result with the layer's running statistics and affine parameters.
            out_gn = F.group_norm(input, self.G, None, None, self.eps)
            out = F.batch_norm(out_gn, self.running_mean, self.running_var,
                               self.weight, self.bias, self.training,
                               self.momentum, self.eps)
            return out
Open Source Code | Yes | Our implementation is available at https://github.com/phquang/Continual-Normalization.
Open Datasets | Yes | We consider a toy experiment on the permuted MNIST (pMNIST) benchmark (Lopez-Paz & Ranzato, 2017)... We follow the standard setting in Chaudhry et al. (2019a) to split the original CIFAR100 (Krizhevsky & Hinton, 2009) or miniIMN (Vinyals et al., 2016) datasets... (see the pMNIST sketch after the table)
Dataset Splits | Yes | We follow the standard setting in Chaudhry et al. (2019a) to split the original CIFAR100 (Krizhevsky & Hinton, 2009) or miniIMN (Vinyals et al., 2016) datasets into a sequence of 20 tasks, three of which are used for hyper-parameter cross-validation, and the remaining 17 tasks are used for continual learning. (See the task-split sketch after the table.)
Hardware Specification | No | The paper mentions 'our GPU' in Appendix D.5 but does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for experiments.
Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2017)' and specific optimizers (SGD, and Adam (Kingma & Ba, 2014)) but does not provide version numbers for these software dependencies or other libraries.
Experiment Setup | Yes | All methods use a standard ResNet-18 backbone (He et al., 2016) (not pre-trained) and are optimized over one epoch with batch size 10 using the SGD optimizer. For each continual learning strategy, we compare our proposed CN with five competing normalization layers... We cross-validate and set the number of groups to G = 32 for our CN and GN in this experiment. (See the training-loop sketch after the table.)
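
As a usage illustration of the CN layer quoted in the Pseudocode row, the sketch below shows one way to drop CN into a ResNet-18 backbone by replacing its BatchNorm2d modules. It assumes the CN class above and torchvision's resnet18; the helper name replace_bn_with_cn is hypothetical and not from the paper.

    # Sketch (assumption, not from the paper): replace every nn.BatchNorm2d in a
    # torchvision ResNet-18 with the CN layer defined in the Pseudocode row.
    import torch.nn as nn
    from torchvision.models import resnet18

    def replace_bn_with_cn(module, G=32):
        # Recursively swap BatchNorm2d children for CN, reusing their settings.
        for name, child in module.named_children():
            if isinstance(child, nn.BatchNorm2d):
                setattr(module, name, CN(child.num_features, eps=child.eps,
                                         G=G, momentum=child.momentum))
            else:
                replace_bn_with_cn(child, G)
        return module

    model = replace_bn_with_cn(resnet18(num_classes=100), G=32)  # not pre-trained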
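
For the pMNIST benchmark cited under Open Datasets, each task is commonly built by applying one fixed random permutation to the pixel positions of every MNIST image. The snippet below is a minimal sketch of that protocol; the function name and seeding scheme are assumptions.

    # Sketch (assumption): one permuted-MNIST task applies a fixed random
    # permutation to the 784 flattened pixel positions of every image.
    import numpy as np

    def make_pmnist_task(images, seed):
        # images: array of shape (N, 784); returns a permuted copy for one task.
        perm = np.random.default_rng(seed).permutation(784)
        return images[:, perm]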
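
The Dataset Splits row describes 20 tasks with 3 held out for cross-validation. Below is a minimal sketch of such a class-level split, assuming disjoint groups of 5 classes per task as in the Split CIFAR-100 protocol; the seed and variable names are assumptions.

    # Sketch (assumption): partition CIFAR-100's 100 classes into 20 disjoint
    # 5-class tasks; 3 tasks for hyper-parameter cross-validation, 17 for CL.
    import numpy as np

    classes = np.random.default_rng(0).permutation(100)
    tasks = [classes[i * 5:(i + 1) * 5] for i in range(20)]
    cv_tasks, cl_tasks = tasks[:3], tasks[3:]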
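
Finally, the Experiment Setup row reports a single training pass per task with batch size 10 and SGD. The sketch below illustrates that online setting under stated assumptions: the learning rate, loss, and task_datasets list are placeholders, and model comes from the first sketch above.

    # Sketch (assumption): one pass (single epoch) over each task's stream with
    # batch size 10 and plain SGD, using the CN-equipped ResNet-18 from above.
    import torch
    from torch.utils.data import DataLoader

    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # lr is an assumption
    criterion = torch.nn.CrossEntropyLoss()

    for task_data in task_datasets:  # task_datasets: placeholder list of Datasets
        loader = DataLoader(task_data, batch_size=10, shuffle=True)
        for x, y in loader:  # a single epoch over each task
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()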