CondConv: Conditionally Parameterized Convolutions for Efficient Inference
Authors: Brandon Yang, Gabriel Bender, Quoc V. Le, Jiquan Ngiam
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that scaling networks with CondConv improves the performance and inference cost trade-off of several existing convolutional neural network architectures on both classification and detection tasks. On ImageNet classification, our CondConv approach applied to EfficientNet-B0 achieves state-of-the-art performance of 78.3% accuracy with only 413M multiply-adds. |
| Researcher Affiliation | Industry | Brandon Yang Google Brain bcyang@google.com Gabriel Bender Google Brain gbender@google.com Quoc V. Le Google Brain qvl@google.com Jiquan Ngiam Google Brain jngiam@google.com |
| Pseudocode | No | The paper describes the CondConv formulation using mathematical equations and textual explanations, but it does not include a structured pseudocode block or an algorithm figure. |
| Open Source Code | Yes | Code and checkpoints for the CondConv TensorFlow layer and CondConv-EfficientNet models are available at: https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet/condconv. |
| Open Datasets | Yes | We evaluate our approach on the ImageNet 2012 classification dataset [35]. The ImageNet dataset consists of 1.28 million training images and 50K validation images from 1000 classes. We next evaluate the effectiveness of CondConv on a different task and dataset with the COCO object detection dataset [24]. |
| Dataset Splits | Yes | The ImageNet dataset consists of 1.28 million training images and 50K validation images from 1000 classes. We train all models on the entire training set and compare the single-crop top-1 validation set accuracy with input image resolution 224x224. Following Howard et al. [15], we train on the combined COCO training and validation sets excluding 8,000 minival images, on which we evaluate our networks. |
| Hardware Specification | No | The paper mentions 'Current accelerators are optimized to train on large batch convolutions' and 'our hardware configuration' but does not specify any particular GPU, CPU, or TPU models used for the experiments. |
| Software Dependencies | No | The paper states 'Code and checkpoints for the CondConv TensorFlow layer', implying the use of TensorFlow, but it does not specify any version numbers for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | For MobileNetV1, MobileNetV2, and ResNet-50, we use the same training hyperparameters for all models on ImageNet, following [21], except we use BatchNorm momentum of 0.9 and disable exponential moving average on weights. For MnasNet [41] and EfficientNet [42], we use the same training hyperparameters as the original papers, with the batch size, learning rate, and training steps scaled appropriately for our hardware configuration. First, we use Dropout [39] on the input to the fully-connected layer preceding the logits, with keep probability between 0.6 and 1.0. Second, we also add data augmentation using the AutoAugment [6] ImageNet policy and Mixup [49] with α = 0.2. |
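Since the paper itself provides no pseudocode, the core CondConv idea referenced above can be illustrated with a minimal NumPy sketch: each example's routing weights mix a bank of expert kernels into a single per-example kernel, which is then applied as an ordinary convolution. This is a simplified illustration assuming 1x1 kernels and a single input example (so the convolution reduces to a matrix multiply); the function and variable names are illustrative and are not taken from the released TensorFlow code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def condconv_1x1(x, experts, routing_matrix):
    """Per-example CondConv with 1x1 kernels (illustrative sketch).

    x:              (H, W, C_in) feature map for one example.
    experts:        (n_experts, C_in, C_out) bank of expert kernels.
    routing_matrix: (C_in, n_experts) learned routing weights.
    """
    # Routing function: sigmoid of a linear map of the globally pooled input.
    pooled = x.mean(axis=(0, 1))                   # (C_in,)
    alpha = sigmoid(pooled @ routing_matrix)       # (n_experts,)
    # Combine experts into one per-example kernel, then convolve once.
    kernel = np.tensordot(alpha, experts, axes=1)  # (C_in, C_out)
    return x @ kernel                              # (H, W, C_out)
```

Because convolution is linear in the kernel, mixing the kernels first and convolving once is mathematically equivalent to running all expert convolutions and mixing their outputs, but it costs only a single convolution per example at inference time, which is the efficiency argument the paper makes.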