Towards Oracle Knowledge Distillation with Neural Architecture Search
Authors: Minsoo Kang, Jonghwan Mun, Bohyung Han
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive experiments on the image classification datasets CIFAR-100 and Tiny ImageNet using various networks. |
| Researcher Affiliation | Academia | ¹Computer Vision Lab., ASRI, Seoul National University, Korea; ²Computer Vision Lab., POSTECH, Korea; ³Neural Processing Research Center (NPRC), Seoul National University, Korea; ¹{kminsoo, bhhan}@snu.ac.kr; ²jonghwan.mun@postech.ac.kr |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. It presents mathematical formulations but not in a structured algorithm format. |
| Open Source Code | No | The paper mentions employing 'publicly available ENAS (Pham et al. 2018) code for neural architecture search implementation in TensorFlow' and provides a link to that third-party repository (https://github.com/melodyguan/enas). It does not explicitly state that their own implementation code for the described methodology is open-source or provide a link to it. |
| Open Datasets | Yes | We evaluate our algorithm on the image classification task using CIFAR-100 and Tiny ImageNet datasets. CIFAR-100 dataset (Krizhevsky 2009) is composed of 50,000 training and 10,000 testing images in 100 classes. |
| Dataset Splits | Yes | For architecture search, 10% of training images are held out as training-validation set to compute reward. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, or cloud compute instances with specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'TensorFlow (Abadi et al. 2016)' and 'PyTorch (Paszke et al. 2017)' but does not specify their version numbers or any other software dependencies with versions. |
| Experiment Setup | Yes | We optimize the networks for 300 epochs using SGD with Nesterov momentum (Sutskever et al. 2013) of 0.9, a weight decay of 0.0001, and a batch size of 128. Following (Lan, Zhu, and Gong 2018), the initial learning rate is set to 0.1 and is divided by 10 at the 150th and 225th epochs, respectively. We also perform a warm-up strategy (He et al. 2016) with a learning rate of 0.01 for ResNet-110 until the 400th and 900th iterations for the CIFAR-100 and Tiny ImageNet datasets, respectively. For KD and OD, the temperature T is fixed to 3 and the balancing factor λ is set to 0. |
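The "Dataset Splits" row above notes that 10% of the training images are held out as a training-validation set to compute the architecture-search reward. The following is a minimal sketch of such a 90/10 split for CIFAR-100 with torchvision; it is not the authors' code, and the data directory and random seed are assumptions.

```python
# Hedged sketch: hold out 10% of the CIFAR-100 training set as a
# training-validation split for the architecture-search reward.
# The root path and the seed below are assumptions, not from the paper.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

train_set = datasets.CIFAR100(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())

num_val = len(train_set) // 10            # 10% held out (5,000 images)
num_train = len(train_set) - num_val      # remaining 90% (45,000 images)
search_train, search_val = random_split(
    train_set, [num_train, num_val],
    generator=torch.Generator().manual_seed(0))  # seed is an assumption
```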
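The "Experiment Setup" row summarizes the training recipe and the KD hyperparameters. Below is a hedged PyTorch sketch of that configuration (the paper reports using PyTorch for training). The function names, the `student_logits`/`teacher_logits` placeholders, and the balancing factor `lam` are illustrative; the exact value of λ is not recoverable from the extracted text, so `lam=0.5` is purely an assumption.

```python
# Hedged sketch of the reported training recipe: SGD with Nesterov
# momentum 0.9, weight decay 1e-4, batch size 128, lr 0.1 divided by 10
# at epochs 150 and 225 (300 epochs total), and a KD loss with T = 3.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, T=3.0, lam=0.5):
    """Hinton-style KD objective; the lambda value here is an assumption."""
    ce = F.cross_entropy(student_logits, targets)
    kl = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    return (1.0 - lam) * ce + lam * kl

def make_optimizer_and_scheduler(model):
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                                weight_decay=1e-4, nesterov=True)
    # Learning rate is divided by 10 at the 150th and 225th epochs.
    # The reported warm-up (lr 0.01 for the first 400 or 900 iterations
    # with ResNet-110) would be applied before this schedule takes effect;
    # it is omitted here for brevity.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[150, 225], gamma=0.1)
    return optimizer, scheduler
```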