EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Authors: Mingxing Tan, Quoc Le

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical study shows that it is critical to balance all dimensions of network width/depth/resolution, and surprisingly such balance can be achieved by simply scaling each of them with constant ratio. We demonstrate that our scaling method works well on existing MobileNets (Howard et al., 2017; Sandler et al., 2018) and ResNet (He et al., 2016). Notably, the effectiveness of model scaling heavily depends on the baseline network; to go even further, we use neural architecture search (Zoph & Le, 2017; Tan et al., 2019) to develop a new baseline network, and scale it up to obtain a family of models, called EfficientNets. Figure 1 summarizes the ImageNet performance, where our EfficientNets significantly outperform other ConvNets.
Researcher Affiliation | Industry | Google Research, Brain Team, Mountain View, CA.
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the methodology described is publicly available.
Open Datasets | Yes | We train our EfficientNet models on ImageNet using similar settings as (Tan et al., 2019). ImageNet (Russakovsky et al., 2015).
Dataset Splits | Yes | ImageNet top-1 validation accuracy to 84.3%. Images are randomly picked from ImageNet validation set.
Hardware Specification | Yes | Latency is measured with batch size 1 on a single core of Intel Xeon CPU E5-2690.
Software Dependencies | No | The paper mentions software components like "RMSProp optimizer," "batch norm," "swish activation," "AutoAugment," and "stochastic depth," but it does not specify concrete version numbers for any of these or other software libraries (e.g., Python, TensorFlow, PyTorch versions).
Experiment Setup | Yes | We train our EfficientNet models on ImageNet using similar settings as (Tan et al., 2019): RMSProp optimizer with decay 0.9 and momentum 0.9; batch norm momentum 0.99; weight decay 1e-5; initial learning rate 0.256 that decays by 0.97 every 2.4 epochs. We also use swish activation (Ramachandran et al., 2018; Elfwing et al., 2018), fixed AutoAugment policy (Cubuk et al., 2019), and stochastic depth (Huang et al., 2016) with drop connect ratio 0.3. Since bigger models need more regularization, we linearly increase dropout (Srivastava et al., 2014) ratio from 0.2 for EfficientNet-B0 to 0.5 for EfficientNet-B7.
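
The Research Type response above quotes the paper's compound scaling claim: network depth, width, and input resolution are balanced by scaling each with a constant ratio. Below is a minimal, illustrative Python sketch of that idea using the coefficients alpha = 1.2, beta = 1.1, gamma = 1.15 reported in the paper; the function name `compound_scale` and the baseline values in the example are hypothetical, not taken from the paper or its code release.

```python
# Illustrative sketch of EfficientNet-style compound scaling (not the authors' code).
# ALPHA/BETA/GAMMA are the coefficients the paper reports for depth/width/resolution;
# they already satisfy alpha * beta^2 * gamma^2 ~= 2, so increasing phi by one
# roughly doubles FLOPS.
import math

ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution scaling bases


def compound_scale(phi, base_depth, base_width, base_resolution):
    """Scale a baseline stage's layer count, channel count, and input
    resolution together using a single compound coefficient phi."""
    depth = int(math.ceil(base_depth * ALPHA ** phi))        # more layers
    width = int(round(base_width * BETA ** phi))             # more channels
    resolution = int(round(base_resolution * GAMMA ** phi))  # larger input
    return depth, width, resolution


# Example with a hypothetical baseline stage: 3 layers, 40 channels, 224x224 input.
print(compound_scale(3, base_depth=3, base_width=40, base_resolution=224))
```

Because the constraint ties the three ratios to a roughly 2x FLOPS increase per unit of phi, the paper searches alpha, beta, gamma only once on the small baseline and then varies phi to obtain the B1 to B7 family.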
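
The Experiment Setup response lists the training hyperparameters verbatim. The sketch below turns two of them into runnable form under stated assumptions: "decays by 0.97 every 2.4 epochs" is interpreted as a staircase schedule, and the dropout increase from 0.2 (B0) to 0.5 (B7) is assumed to be a linear interpolation across variants; the function names are illustrative, not from the paper's training code.

```python
# Hyperparameters quoted in the Experiment Setup row, collected as constants.
INITIAL_LR = 0.256
LR_DECAY_FACTOR = 0.97
LR_DECAY_EPOCHS = 2.4
WEIGHT_DECAY = 1e-5
RMSPROP_DECAY = 0.9
RMSPROP_MOMENTUM = 0.9
BN_MOMENTUM = 0.99


def learning_rate(epoch: float) -> float:
    """Staircase exponential decay: multiply by 0.97 every 2.4 epochs
    (assumed interpretation of the quoted schedule)."""
    return INITIAL_LR * LR_DECAY_FACTOR ** int(epoch / LR_DECAY_EPOCHS)


def dropout_rate(variant: int) -> float:
    """Assumed linear interpolation of dropout from 0.2 (B0) to 0.5 (B7)."""
    return 0.2 + (0.5 - 0.2) * variant / 7


if __name__ == "__main__":
    print(learning_rate(epoch=10))   # 0.256 * 0.97**4
    print(dropout_rate(variant=4))   # dropout ratio for EfficientNet-B4
```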