Evolving Normalization-Activation Layers

Authors: Hanxiao Liu, Andy Brock, Karen Simonyan, Quoc Le

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Our experiments show that EvoNorms work well not only on image classification models including ResNets, MobileNets and EfficientNets but also transfer well to Mask R-CNN with FPN/SpineNet for instance segmentation and to BigGAN for image synthesis, outperforming BatchNorm and GroupNorm based layers in many cases." |
| Researcher Affiliation | Industry | Google Research, Brain Team; DeepMind. {hanxiaol,ajbrock,simonyan,qvl}@google.com |
| Pseudocode | Yes | "See pseudocode in Appendix A for details." (A hedged sketch of one discovered layer follows this table.) |
| Open Source Code | Yes | Code for EvoNorms on ResNets: https://github.com/tensorflow/tpu/tree/master/models/official/resnet |
| Open Datasets | Yes | "We include experimental details in Appendix C, including those for the proxy task, search, reranking, and full-fledged evaluations. In summary, we did the search on CIFAR-10, and re-ranked the top-10 layers on a held-out set of ImageNet to obtain the best 3 layers." |
| Dataset Splits | Yes | "We discard layers that achieve less than 20% CIFAR-10 validation accuracy after training for 100 steps." |
| Hardware Specification | No | "Evolution on CIFAR-10 took 2 days to complete with 5000 CPU workers." (The worker count is stated, but no detailed hardware specification is given.) |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1). |
| Experiment Setup | Yes | "We include experimental details in Appendix C, including those for the proxy task, search, reranking, and full-fledged evaluations. In summary, we did the search on CIFAR-10, and re-ranked the top-10 layers on a held-out set of ImageNet to obtain the best 3 layers. ... Hyperparameters are inherited from the original implementations (usually in favor of BNs) without tuning w.r.t. EvoNorms." |
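
The official EvoNorm code linked above is in TensorFlow. As an illustration of what the Appendix A pseudocode describes, below is a minimal PyTorch sketch of EvoNorm-S0, the batch-independent layer reported in the paper, which computes y = x · sigmoid(v·x) / group_std(x) · γ + β with per-channel learnable v, γ, β. The class name, the `groups` default, and the `eps` value are assumptions for illustration, not the authors' reference implementation.

```python
import torch
import torch.nn as nn


class EvoNormS0(nn.Module):
    """Sketch of EvoNorm-S0: y = x * sigmoid(v * x) / group_std(x) * gamma + beta.

    Statistics are computed per sample over channel groups (as in GroupNorm),
    so there is no batch dependency. Parameter names, the `groups` default,
    and `eps` are assumptions, not the paper's reference code.
    """

    def __init__(self, num_channels: int, groups: int = 32, eps: float = 1e-5):
        super().__init__()
        assert num_channels % groups == 0, "groups must divide num_channels"
        self.groups = groups
        self.eps = eps
        # Per-channel affine parameters, analogous to BatchNorm/GroupNorm.
        self.gamma = nn.Parameter(torch.ones(1, num_channels, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, num_channels, 1, 1))
        self.v = nn.Parameter(torch.ones(1, num_channels, 1, 1))

    def group_std(self, x: torch.Tensor) -> torch.Tensor:
        # Standard deviation over each group of channels, per sample.
        n, c, h, w = x.shape
        grouped = x.reshape(n, self.groups, c // self.groups, h, w)
        var = grouped.var(dim=(2, 3, 4), keepdim=True)
        std = torch.sqrt(var + self.eps)
        return std.expand_as(grouped).reshape(n, c, h, w)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The numerator fuses normalization with a SiLU-like activation,
        # so no separate ReLU/activation layer is needed afterwards.
        return x * torch.sigmoid(self.v * x) / self.group_std(x) * self.gamma + self.beta
```

Because its statistics are computed per sample rather than per batch, a layer like this can stand in for a BatchNorm-plus-activation block without sensitivity to batch size, which is the property the paper highlights for the S series.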