Zero-Shot Knowledge Distillation in Deep Networks
Authors: Gaurav Kumar Nayak, Konda Reddy Mopuri, Vaisakh Shaj, Venkatesh Babu Radhakrishnan, Anirban Chakraborty
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our ZSKD approach via an empirical evaluation over multiple benchmark datasets and model architectures (sec. 4). |
| Researcher Affiliation | Academia | 1Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India 2School of Informatics, University of Edinburgh, United Kingdom 3University of Lincoln, United Kingdom. Correspondence to: Gaurav Kumar Nayak <gauravnayak@iisc.ac.in> |
| Pseudocode | Yes | Algorithm 1 Zero-Shot Knowledge Distillation (a hedged code sketch of the procedure follows the table). |
| Open Source Code | No | The paper does not provide a direct link to open-source code or explicitly state that the code will be released. |
| Open Datasets | Yes | MNIST (LeCun et al., 1998), Fashion MNIST (FMNIST) (Xiao et al., 2017), and CIFAR-10 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | No | The paper specifies training and test set sizes for MNIST (60000 training, 10000 test), Fashion MNIST (60000 training, 10000 test), and CIFAR-10 (50000 training, 10000 test), but does not explicitly mention a separate validation split. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Input images are resized from 28x28 to 32x32 and the pixel values are normalized to be in [0, 1] before feeding into the models. We consider two (B = 2) scaling factors, β1 = 1.0 and β2 = 0.1, across all the datasets, i.e., for each dataset, half the Data Impressions are generated with β1 and the other half with β2. A temperature value (τ) of 20 is used across all the datasets. We augment the samples using regular operations such as scaling, translation, rotation, flipping, etc. (A sketch of this pipeline follows the table.) |
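The pseudocode row references Algorithm 1 of the paper. Below is a minimal sketch of that procedure, assuming PyTorch (the paper lists no framework, and the function names, optimizer choice, step count, and learning rate are illustrative assumptions): class similarities are computed from the teacher's final-layer weights, per-class softmax targets are sampled from a Dirichlet whose concentration is β times a similarity row, and a random input is optimized until the teacher's tempered softmax matches the sampled target.

```python
import torch
import torch.nn.functional as F

def class_similarity(final_layer_weights):
    """Cosine similarity between the teacher's class weight vectors,
    rescaled so each row can serve as a Dirichlet concentration."""
    w = F.normalize(final_layer_weights, dim=1)   # (num_classes, feat_dim)
    c = w @ w.t()
    c = (c - c.min()) / (c.max() - c.min())       # rescale into [0, 1]
    return c.clamp(min=1e-3)                      # concentrations must be > 0

def sample_softmax_target(sim_row, beta):
    """Sample one soft label for a class from Dir(beta * similarity row)."""
    return torch.distributions.Dirichlet(beta * sim_row).sample()

def generate_data_impression(teacher, target, shape=(1, 1, 32, 32),
                             tau=20.0, steps=1500, lr=0.01):
    """Optimize a random input so the teacher's softmax at temperature
    tau matches `target` (one Data Impression). `steps` and `lr` are
    illustrative values, not taken from the paper."""
    x = torch.randn(shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = teacher(torch.sigmoid(x))        # keep pixels in [0, 1]
        # cross-entropy between the sampled soft target and tempered softmax
        loss = -(target * F.log_softmax(logits / tau, dim=1)).sum()
        loss.backward()
        opt.step()
    return torch.sigmoid(x).detach()
```

Per the experiment-setup row, half of the Data Impressions would be generated with β = 1.0 and the other half with β = 0.1; the student is then trained on the impressions with a standard temperature-τ distillation loss.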
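Likewise, a sketch of the stated input pipeline and distillation objective (resize to 32x32, pixels in [0, 1], τ = 20), again assuming PyTorch/torchvision; the augmentation magnitudes are assumptions, since the paper names only the operation types:

```python
import torch.nn.functional as F
import torchvision.transforms as T

# Resize 28x28 inputs to 32x32; ToTensor already maps pixels into [0, 1].
preprocess = T.Compose([
    T.Resize((32, 32)),
    T.ToTensor(),
])

# "Scaling, translation, rotation, flipping"; the ranges below are
# illustrative choices, not taken from the paper.
augment = T.Compose([
    T.RandomAffine(degrees=10, translate=(0.1, 0.1), scale=(0.9, 1.1)),
    T.RandomHorizontalFlip(),
])

def distillation_loss(student_logits, teacher_logits, tau=20.0):
    """Standard KD objective at temperature tau (tau = 20 in the paper):
    KL divergence between tempered softmaxes, scaled by tau**2."""
    p_teacher = F.softmax(teacher_logits / tau, dim=1)
    log_p_student = F.log_softmax(student_logits / tau, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * tau**2
```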