Learning Student Networks with Few Data

Authors: Shumin Kong, Tianyu Guo, Shan You, Chang Xu (pp. 4469-4476)

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results on benchmark datasets validate the effectiveness of our proposed method." and "Now we empirically evaluate the proposed algorithm on popular benchmark datasets, including CIFAR-10 dataset, CIFAR-100 dataset and Fashion-MNIST dataset."
Researcher Affiliation | Collaboration | Shumin Kong (1), Tianyu Guo (1,2), Shan You (3), Chang Xu (1); (1) School of Computer Science, Faculty of Engineering, The University of Sydney, Australia; (2) Key Laboratory of Machine Perception (MOE), CMIC, School of EECS, Peking University, China; (3) SenseTime Research, China
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the methodology is openly available.
Open Datasets | Yes | "Now we empirically evaluate the proposed algorithm on popular benchmark datasets, including CIFAR-10 dataset, CIFAR-100 dataset and Fashion-MNIST dataset." followed by citations such as (Krizhevsky 2009) for CIFAR and (Xiao, Rasul, and Vollgraf 2017) for Fashion-MNIST.
Dataset Splits | No | The paper specifies training and testing set sizes (e.g., "50,000 of the images are training set and the remaining 10,000 images are intended for testing" for CIFAR-10/100, and "60,000 and 10,000, respectively" for Fashion-MNIST) but does not describe a validation split; a loading sketch illustrating these splits appears below the table.
Hardware Specification | Yes | The experiments are run on a single NVIDIA GeForce 1080 Ti GPU.
Software Dependencies | No | The paper mentions training methods such as "back propagation and Stochastic Gradient Descent (SGD)" but does not specify the software libraries, frameworks, or version numbers used to implement the experiments.
Experiment Setup | Yes | "In our experiments, ϵ is set to 1 and α is set to 0.001. Temperature T for KD loss is set to 3. On both datasets, the student networks are trained using back propagation and Stochastic Gradient Descent (SGD) with momentum for 500 epochs. During training, the learning rate and the momentum decay linearly." A hedged training-loop sketch using these values follows the table.
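
As referenced in the Dataset Splits row, the following is a minimal, hypothetical sketch (not code from the paper) that loads the three benchmark datasets with torchvision and reproduces the quoted training/test sizes. The data directory, the random seed, and the 45,000/5,000 validation carve-out are assumptions; the paper itself describes no validation split.

```python
# Hypothetical sketch: loading CIFAR-10/100 and Fashion-MNIST with torchvision
# to illustrate the train/test sizes quoted in the reproducibility table.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# CIFAR-10 / CIFAR-100: 50,000 training images and 10,000 test images each.
cifar10_train = datasets.CIFAR10("./data", train=True, download=True, transform=to_tensor)
cifar10_test = datasets.CIFAR10("./data", train=False, download=True, transform=to_tensor)

# Fashion-MNIST: 60,000 training images and 10,000 test images.
fmnist_train = datasets.FashionMNIST("./data", train=True, download=True, transform=to_tensor)
fmnist_test = datasets.FashionMNIST("./data", train=False, download=True, transform=to_tensor)

# The paper does not describe a validation split; if one were needed, it could
# be carved out of the training set, e.g. a 45k/5k split for CIFAR-10 (assumption).
train_subset, val_subset = random_split(
    cifar10_train, [45_000, 5_000], generator=torch.Generator().manual_seed(0)
)
print(len(cifar10_train), len(cifar10_test), len(train_subset), len(val_subset))
```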
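The Experiment Setup row quotes the paper's hyperparameters. Below is a hedged sketch, not the authors' implementation: it assumes a standard Hinton-style knowledge-distillation objective with temperature T = 3 and SGD with momentum, and applies the linear decay of learning rate and momentum over 500 epochs described in the quote. The initial learning rate and momentum values are assumptions, and the roles of ϵ and α in the paper's full few-data objective are not reproduced here.

```python
# Hypothetical sketch of the quoted training setup (assumptions noted inline).
import torch
import torch.nn.functional as F

T = 3.0                # temperature for the KD loss (from the paper)
EPOCHS = 500           # number of training epochs (from the paper)
LR0, MOM0 = 0.1, 0.9   # assumed initial learning rate and momentum; not stated in the paper

def kd_loss(student_logits, teacher_logits, temperature=T):
    """Standard Hinton-style KD loss between temperature-softened distributions."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

def train(student, teacher, loader, device="cuda"):
    optimizer = torch.optim.SGD(student.parameters(), lr=LR0, momentum=MOM0)
    teacher.eval()
    for epoch in range(EPOCHS):
        # Linear decay of the learning rate and the momentum, as described in the paper.
        decay = 1.0 - epoch / EPOCHS
        for group in optimizer.param_groups:
            group["lr"] = LR0 * decay
            group["momentum"] = MOM0 * decay
        for images, _ in loader:
            images = images.to(device)
            with torch.no_grad():
                teacher_logits = teacher(images)
            loss = kd_loss(student(images), teacher_logits)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```

This sketch only covers the distillation term; the paper's own loss additionally involves the ϵ and α parameters quoted above, whose exact formulation is not reconstructed here.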