reproducibilityindex.ai

Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation

Authors: Taehyeon Kim, Jaehoon Oh, Nak Yil Kim, Sangwook Cho, Se-Young Yun

IJCAI 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We investigate the training and test accuracies according to the change in α in L and τ in L_KL (Figure 3).
Researcher Affiliation	Academia	Taehyeon Kim¹, Jaehoon Oh², Nak Yil Kim¹, Sangwook Cho¹ and Se-Young Yun¹; ¹Graduate School of Artificial Intelligence, KAIST; ²Graduate School of Knowledge Service Engineering, KAIST; {potter32, jhoon.oh, nakyilkim, sangwookcho, yunseyoung}@kaist.ac.kr
Pseudocode	No	The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code	Yes	The code to reproduce the experiments is publicly available online at https://github.com/jhoon-oh/kd_data/.
Open Datasets	Yes	image classification on CIFAR-100 with a family of Wide-ResNet (WRN) [Zagoruyko and Komodakis, 2016b] and ImageNet with a family of ResNet (RN) [He et al., 2016].
Dataset Splits	No	The paper mentions training and testing datasets (CIFAR-100, ImageNet) but does not provide specific training/validation/test dataset splits or explicit mention of a validation set in the experimental setup.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies	No	The paper mentions 'PyTorch SGD optimizer' but does not provide specific version numbers for PyTorch or any other software dependencies.
Experiment Setup	Yes	We used a standard PyTorch SGD optimizer with a momentum of 0.9, weight decay, and apply standard data augmentation. Other than those mentioned, the training settings from the original papers [Heo et al., 2019a; Cho and Hariharan, 2019] were used.
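
The Research Type row refers to a mixing weight α in the total loss and a temperature τ in the KL term. As a rough illustration of what those two knobs control, here is a minimal sketch in plain Python. It assumes the common Hinton-style formulation, (1 - α) · cross-entropy + α · τ² · KL(teacher ‖ student) on temperature-softened outputs, with a mean squared error between raw logits as the alternative. It is not taken from the authors' repository; the function names, the default values of alpha and tau, and the choice to average the MSE over classes are all assumptions made for the example.

```python
import math


def softmax(logits, tau=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / tau for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_distillation_loss(student_logits, teacher_logits, tau=1.0):
    """tau^2 * KL(teacher || student) on softened distributions (assumed form)."""
    p_t = softmax(teacher_logits, tau)
    p_s = softmax(student_logits, tau)
    kl = sum(t * math.log(t / s) for t, s in zip(p_t, p_s) if t > 0.0)
    return tau ** 2 * kl


def mse_logit_loss(student_logits, teacher_logits):
    """Squared error between raw logits, averaged over classes (assumed reduction)."""
    n = len(student_logits)
    return sum((s - t) ** 2 for s, t in zip(student_logits, teacher_logits)) / n


def cross_entropy(student_logits, label):
    """Standard cross-entropy of the student against an integer class label."""
    return -math.log(softmax(student_logits)[label])


def total_kd_loss(student_logits, teacher_logits, label, alpha=0.5, tau=4.0):
    """(1 - alpha) * CE + alpha * KL term; alpha and tau defaults are illustrative."""
    ce = cross_entropy(student_logits, label)
    kd = kl_distillation_loss(student_logits, teacher_logits, tau)
    return (1.0 - alpha) * ce + alpha * kd


if __name__ == "__main__":
    student = [2.0, 0.5, -1.0]   # hypothetical student logits
    teacher = [3.0, 0.0, -2.0]   # hypothetical teacher logits
    for tau in (1.0, 4.0, 20.0):
        print(tau, kl_distillation_loss(student, teacher, tau))
    print("mse", mse_logit_loss(student, teacher))
```

With alpha=0 the objective reduces to plain cross-entropy, and with alpha=1 only the teacher signal remains, which is the axis along which the quoted sentence varies α.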