DropBlock: A regularization method for convolutional networks
Authors: Golnaz Ghiasi, Tsung-Yi Lin, Quoc V. Le
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that DropBlock works better than dropout in regularizing convolutional networks. On ImageNet classification, the ResNet-50 architecture with DropBlock achieves 78.13% accuracy, which is more than a 1.6% improvement over the baseline. On COCO detection, DropBlock improves the Average Precision of RetinaNet from 36.8% to 38.4%. |
| Researcher Affiliation | Industry | Golnaz Ghiasi (Google Brain), Tsung-Yi Lin (Google Brain), Quoc V. Le (Google Brain) |
| Pseudocode | Yes | Algorithm 1 DropBlock (see the sketch after this table) |
| Open Source Code | Yes | The code for these results is at https://github.com/tensorflow/tpu/tree/master/models/official/resnet. The paper also links https://github.com/tensorflow/tpu/tree/master/models/experimental/amoeba_net and https://github.com/tensorflow/tpu/tree/master/models/official/retinanet |
| Open Datasets | Yes | The ILSVRC 2012 classification dataset [25] contains 1.2 million training images, 50,000 validation images, and 150,000 testing images. Images are labeled with 1,000 categories. COCO dataset [30]. PASCAL VOC 2012 dataset. |
| Dataset Splits | Yes | The ILSVRC 2012 classification dataset [25] contains 1.2 million training images, 50,000 validation images, and 150,000 testing images. Following the common practice, we report classification accuracy on the validation set. |
| Hardware Specification | Yes | We trained models on Tensor Processing Units (TPUs). The models were trained on TPU with 64 images in a batch. |
| Software Dependencies | No | The paper mentions using "TensorFlow implementations" but does not specify a version number for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | We used the default image size (224×224 for ResNet-50 and 331×331 for AmoebaNet), batch size (1024 for ResNet-50 and 2048 for AmoebaNet) and hyperparameter settings for all the models. We only increased the number of training epochs from 90 to 300 for the ResNet-50 architecture. The learning rate was decayed by a factor of 0.1 at 100, 200 and 265 epochs. AmoebaNet models were trained for 340 epochs with an exponential decay scheme for the learning rate. The RetinaNet model was trained for 150 epochs (280k training steps). An initial learning rate of 0.08 was used for the first 120 epochs and decayed by 0.1 at 120 and 140 epochs. We used α = 0.25 and γ = 1.5 for focal loss. We used a weight decay of 0.0001 and a momentum of 0.9. (A schedule sketch follows the table.) |
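
To make the pseudocode row concrete, below is a minimal sketch of DropBlock (Algorithm 1) applied to a single 2D feature map. It assumes NumPy rather than the TensorFlow release linked above, and uses the γ formula from the paper; the function name and signature are illustrative, not the authors' API.

```python
import numpy as np

def dropblock(x, keep_prob=0.9, block_size=7, training=True, rng=None):
    """Apply DropBlock to a single HxW feature map (sketch, not the official code)."""
    if not training:
        # Like dropout, DropBlock is disabled at inference time.
        return x
    if rng is None:
        rng = np.random.default_rng()

    h, w = x.shape
    # gamma: how many block centers to sample so that roughly (1 - keep_prob)
    # of the activations end up dropped (Section 3 of the paper).
    gamma = ((1.0 - keep_prob) / block_size**2) * (h * w) / (
        (h - block_size + 1) * (w - block_size + 1))

    # Sample block centers only where a full block_size x block_size square fits.
    half = block_size // 2
    centers = rng.random((h, w)) < gamma
    centers[:half, :] = False
    centers[h - half:, :] = False
    centers[:, :half] = False
    centers[:, w - half:] = False

    # Zero out a square block around every sampled center.
    mask = np.ones((h, w), dtype=x.dtype)
    for i, j in zip(*np.nonzero(centers)):
        mask[i - half:i + half + 1, j - half:j + half + 1] = 0.0

    # Rescale so the expected magnitude of the activations is unchanged.
    return x * mask * (mask.size / max(mask.sum(), 1.0))
```

Note that the paper's best results do not use a fixed keep_prob: the scheduled DropBlock variant decreases keep_prob linearly from 1.0 to the target value over training.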
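
The step-decay schedule quoted in the experiment-setup row can also be sketched briefly. The sketch below uses only the values stated for the RetinaNet run (initial rate 0.08, decay by a factor of 0.1 at epochs 120 and 140, 150 epochs total); the function itself is illustrative and not taken from the released code.

```python
def retinanet_learning_rate(epoch, base_lr=0.08, boundaries=(120, 140), decay=0.1):
    """Step-decay learning rate for a given training epoch (0-indexed)."""
    lr = base_lr
    for boundary in boundaries:
        if epoch >= boundary:
            lr *= decay
    return lr

# Epochs 0-119 train at 0.08, epochs 120-139 at 0.008, and epochs 140-149 at 0.0008.
```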