VanillaNet: the Power of Minimalism in Deep Learning
Authors: Hanting Chen, Yunhe Wang, Jianyuan Guo, Dacheng Tao
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimentation demonstrates that VanillaNet delivers performance on par with renowned deep neural networks and vision transformers, showcasing the power of minimalism in deep learning. |
| Researcher Affiliation | Collaboration | Hanting Chen¹, Yunhe Wang¹, Jianyuan Guo¹, Dacheng Tao². ¹Huawei Noah's Ark Lab. ²School of Computer Science, University of Sydney. |
| Pseudocode | No | The paper describes the deep training strategy and the series-informed activation function in prose and equations but does not present them in a structured pseudocode or algorithm block (a hedged sketch of the series activation appears after this table). |
| Open Source Code | Yes | Pre-trained models and code are available at https://github.com/huawei-noah/VanillaNet and https://gitee.com/mindspore/models/tree/master/research/cv/vanillanet. |
| Open Datasets | Yes | To illustrate the effectiveness of the proposed method, we conduct experiments on the ImageNet [8] dataset, which consists of 224×224 pixel RGB color images. The ImageNet dataset contains 1.28M training images and 50K validation images with 1000 categories. |
| Dataset Splits | Yes | The ImageNet dataset contains 1.28M training images and 50K validation images with 1000 categories. |
| Hardware Specification | Yes | Latency is tested on an Nvidia A100 GPU with a batch size of 1. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., programming language versions, library versions like PyTorch, TensorFlow, or scikit-learn). |
| Experiment Setup | Yes | Table 8: ImageNet-1K training settings. The table provides specific values such as weight init trunc. normal (0.2), optimizer LAMB [58], loss function BCE, base learning rate 3.5e-3 (VanillaNet-5, 8-13) / 4.8e-3 (VanillaNet-6, 7), weight decay 0.35/0.35/0.35/0.3/0.3/0.25/0.3/0.3/0.3 (per variant), batch size 1024, 300 training epochs, cosine-decay learning rate schedule, dropout 0.05, and others (a hedged training-setup sketch follows the table). |
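
Since the paper gives no pseudocode for the series-informed activation, here is a minimal PyTorch sketch of the idea: the paper's formulation A_s(x) = Σ_{i,j} a_{i,j} A(x + b_{i,j}) realized as a base ReLU followed by a learnable depthwise convolution. The class name, the initialization scale, and the default `act_num` are illustrative assumptions, not the authors' exact code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SeriesActivation(nn.Module):
    """Sketch of a series-informed activation: a base non-linearity
    (ReLU) followed by a learnable depthwise convolution that sums
    shifted copies of the activation, so each channel learns its own
    weighted combination of neighboring activations."""

    def __init__(self, dim: int, act_num: int = 3):
        super().__init__()
        k = 2 * act_num + 1  # spatial extent of the series
        # One (k x k) set of coefficients per channel (depthwise weight).
        self.weight = nn.Parameter(torch.randn(dim, 1, k, k) * 0.02)
        self.bn = nn.BatchNorm2d(dim)
        self.act_num = act_num
        self.dim = dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.relu(x)
        # Depthwise conv mixes each channel's shifted activations;
        # padding keeps the spatial size unchanged.
        x = F.conv2d(x, self.weight, padding=self.act_num, groups=self.dim)
        return self.bn(x)
```

For example, `SeriesActivation(64)(torch.randn(2, 64, 56, 56))` returns a tensor of the same shape; the depthwise convolution is what adds capacity to the otherwise plain activation without deepening the network.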
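
Similarly, the Table 8 settings from the experiment-setup row can be wired together in a few lines. The sketch below assumes timm's `Lamb` optimizer and PyTorch's built-in cosine scheduler; the stand-in model and the `cfg` names are hypothetical, and only the numeric values come from the paper.

```python
import torch
import torch.nn as nn
from timm.optim import Lamb  # LAMB [58]; assumes timm provides this class

# Values transcribed from Table 8 (ImageNet-1K training settings).
cfg = dict(
    base_lr=3.5e-3,     # 4.8e-3 for VanillaNet-6/7
    weight_decay=0.35,  # 0.25-0.35 depending on the variant
    batch_size=1024,
    epochs=300,
    dropout=0.05,
)

# Hypothetical stand-in for a VanillaNet variant.
model = nn.Sequential(nn.Flatten(), nn.Dropout(cfg["dropout"]),
                      nn.Linear(3 * 224 * 224, 1000))

# Truncated-normal weight init (0.2), per Table 8.
for m in model.modules():
    if isinstance(m, nn.Linear):
        nn.init.trunc_normal_(m.weight, std=0.2)

optimizer = Lamb(model.parameters(), lr=cfg["base_lr"],
                 weight_decay=cfg["weight_decay"])
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=cfg["epochs"])
criterion = nn.BCEWithLogitsLoss()  # BCE loss over one-hot targets
```

The BCE criterion expects one-hot (or label-smoothed) targets rather than class indices, which matches the BCE-loss entry in Table 8.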