KDGAN: Knowledge Distillation with Generative Adversarial Networks

Authors: Xiaojie Wang, Rui Zhang, Yu Sun, Jianzhong Qi

NeurIPS 2018

Reproducibility variables (result, with the supporting LLM response):

- Research Type: Experimental. "Extensive experiments using real datasets confirm the superiority of KDGAN in both accuracy and training speed."
- Researcher Affiliation: Collaboration. Xiaojie Wang (University of Melbourne, xiaojiew94@gmail.com), Rui Zhang (University of Melbourne, rui.zhang@unimelb.edu.au), Yu Sun (Twitter Inc., ysun@twitter.com), Jianzhong Qi (University of Melbourne, jianzhong.qi@unimelb.edu.au).
- Pseudocode: Yes. Algorithm 1: "Minibatch stochastic gradient descent training of KDGAN." (A hedged training-loop sketch follows this list.)
- Open Source Code: Yes. "The code and the data are made available at https://github.com/xiaojiew1/KDGAN/."
- Open Datasets: Yes. "We use the widely adopted MNIST [27] and CIFAR-10 [26] datasets. We use the Yahoo Flickr Creative Commons 100 Million (YFCC100M) dataset [45] in the experiments."
- Dataset Splits: No. The paper tunes hyperparameters "based on validation performance" but reports split sizes only for the training and test sets, not for a validation set.
- Hardware Specification: No. The paper contrasts a "powerful server" with a "mobile phone" when motivating the problem, but does not state the CPU, GPU, or other hardware used to run the experiments.
- Software Dependencies: No. The paper mentions TensorFlow [1] without a version number; other components such as VGGNet, LSTMs, and word embeddings are also mentioned without version details.
- Experiment Setup: Yes. "We use two formulations of the distillation losses including the L2 loss [7] and the Kullback-Leibler divergence [23]. We search for the optimal values for the hyperparameters α in [0.0, 1.0], β in [0.001, 1000], and γ in [0.0001, 100] based on validation performance. We find that a reasonable annealing schedule for the temperature parameter τ is to start with a large value (1.0) and exponentially decay it to a small value (0.1)." (A sketch of these losses and the annealing schedule follows the list.)
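The Pseudocode item refers to Algorithm 1, the minibatch stochastic gradient descent training of KDGAN. Below is a heavily simplified, hypothetical PyTorch sketch of what such alternating updates could look like for the three players (a lightweight classifier, a teacher, and a discriminator), using Gumbel-Softmax label sampling and distillation terms. The paper's released implementation is in TensorFlow; every model size, loss weight, and function name here is an illustrative assumption, not the authors' code.

```python
# Hypothetical sketch of KDGAN-style alternating minibatch updates.
# All architectures, weights (alpha, tau), and names are assumptions.
import torch
import torch.nn.functional as F
from torch import nn

num_features, num_classes = 784, 10
classifier = nn.Linear(num_features, num_classes)          # lightweight student C
teacher = nn.Sequential(nn.Linear(num_features, 256), nn.ReLU(),
                        nn.Linear(256, num_classes))       # larger teacher T
discriminator = nn.Linear(num_features + num_classes, 1)   # D scores (x, label) pairs

opt_c = torch.optim.SGD(classifier.parameters(), lr=1e-2)
opt_t = torch.optim.SGD(teacher.parameters(), lr=1e-2)
opt_d = torch.optim.SGD(discriminator.parameters(), lr=1e-2)

alpha, tau = 0.5, 1.0  # assumed adversarial/distillation trade-off and temperature

def d_logit(x, y_dist):
    # Discriminator logit for an (instance, label distribution) pair.
    return discriminator(torch.cat([x, y_dist], dim=1)).squeeze(1)

def train_step(x, y):
    real = torch.ones(x.size(0))
    fake = torch.zeros(x.size(0))
    y_true = F.one_hot(y, num_classes).float()

    # 1) Update D: separate true labels from pseudo-labels sampled by C and T.
    with torch.no_grad():
        y_c = F.gumbel_softmax(classifier(x), tau=tau)  # differentiable label samples
        y_t = F.gumbel_softmax(teacher(x), tau=tau)
    loss_d = (F.binary_cross_entropy_with_logits(d_logit(x, y_true), real)
              + F.binary_cross_entropy_with_logits(d_logit(x, y_c), fake)
              + F.binary_cross_entropy_with_logits(d_logit(x, y_t), fake))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # 2) Update T: fool D while staying close to the classifier (distillation).
    logits_t = teacher(x)
    p_c = F.softmax(classifier(x).detach() / tau, dim=1)
    loss_t = (F.binary_cross_entropy_with_logits(
                  d_logit(x, F.gumbel_softmax(logits_t, tau=tau)), real)
              + F.kl_div(F.log_softmax(logits_t / tau, dim=1), p_c,
                         reduction="batchmean"))
    opt_t.zero_grad()
    loss_t.backward()
    opt_t.step()

    # 3) Update C: fool D while distilling from the teacher.
    logits_c = classifier(x)
    p_t = F.softmax(teacher(x).detach() / tau, dim=1)
    loss_c = (alpha * F.binary_cross_entropy_with_logits(
                  d_logit(x, F.gumbel_softmax(logits_c, tau=tau)), real)
              + (1 - alpha) * F.kl_div(F.log_softmax(logits_c / tau, dim=1), p_t,
                                       reduction="batchmean"))
    opt_c.zero_grad()
    loss_c.backward()
    opt_c.step()
    return loss_d.item(), loss_t.item(), loss_c.item()

# Toy usage on one random minibatch:
x = torch.randn(32, num_features)
y = torch.randint(0, num_classes, (32,))
print(train_step(x, y))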
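The Experiment Setup item quotes two distillation losses and a temperature annealing schedule. The following NumPy sketch shows, for illustration only, one common form of each loss and an exponential decay of τ from 1.0 to 0.1; the function names and the exact decay formula are assumptions rather than details taken from the paper.

```python
# Illustrative sketch (not the authors' code) of L2 and KL distillation losses
# and an exponential temperature schedule decaying tau from 1.0 to 0.1.
import numpy as np

def softmax(logits, tau=1.0):
    z = logits / tau
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def l2_distillation(student_logits, teacher_logits):
    # Squared error between logits, one common form of the L2 distillation loss.
    return 0.5 * np.mean(np.sum((student_logits - teacher_logits) ** 2, axis=1))

def kl_distillation(student_logits, teacher_logits, tau=1.0, eps=1e-12):
    # KL(teacher || student) on temperature-softened distributions.
    p_t = softmax(teacher_logits, tau)
    p_s = softmax(student_logits, tau)
    return np.mean(np.sum(p_t * (np.log(p_t + eps) - np.log(p_s + eps)), axis=1))

def annealed_tau(step, total_steps, tau_start=1.0, tau_end=0.1):
    # Exponential decay from tau_start to tau_end over the course of training.
    return tau_start * (tau_end / tau_start) ** (step / total_steps)

# tau moves from 1.0 through ~0.316 at the midpoint down to 0.1 at the end:
for step in (0, 5000, 10000):
    print(round(annealed_tau(step, total_steps=10000), 3))
```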