Towards Understanding Knowledge Distillation

Authors: Mary Phuong, Christoph Lampert

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To experimentally test the effect of data geometry on the effectiveness of distillation, we adopt the setting of Corollary 2. We consider a series of tasks of varying angular alignment, as measured by the degree \kappa of the polynomial by which p(\theta) is upper bounded. Specifically, for any \kappa, the task (P_x^\kappa, w^\kappa) is defined by the following sampling procedure... We use an input space dimension of d = 1000 and a transfer set size of n = 20. Then, we train a linear student by distillation on each of the tasks and evaluate its transfer risk on held-out data. Figure 3 shows the results." (A sketch of this linear-distillation experiment appears after the table.)
Researcher Affiliation | Academia | "IST Austria (Institute of Science and Technology Austria)"
Pseudocode | No | No pseudocode or algorithm blocks are present in the paper.
Open Source Code | No | The paper does not mention providing open-source code for the described methodology.
Open Datasets | Yes | "We train the learners w_\delta for \delta \in \{0, 10, \ldots, 90\} on the digits 0 and 1 of the MNIST dataset"
Dataset Splits | No | "We set the transfer set size to n = 100 and evaluate the risk on the test set." (The quote confirms that a test set is used but gives no details of how the splits were constructed or sized.)
Hardware Specification | No | The paper does not provide specific details regarding the hardware used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies or version numbers needed to replicate the experiments.
Experiment Setup | Yes | "We use an input space dimension of d = 1000 and a transfer set size n = 20... We set the transfer set size to n = 100... We train the learners on the polynomial-angle task (P_x^\kappa, w^\kappa) from Section 5.1, with \kappa = 1, d = 100 and n = 5."
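A note on the "Research Type" and "Experiment Setup" rows: in the paper's linear setting, distillation has a closed-form solution, which makes the quoted experiment straightforward to replicate. The sketch below is a minimal, hedged reconstruction: it assumes the characterization from the paper's linear analysis (the distilled student's weight vector equals the projection of the teacher's weights onto the span of the transfer set) and substitutes an isotropic Gaussian input distribution for the paper's polynomial-angle sampling procedure, which is not fully specified in the quotes above. The variable names and evaluation size are illustrative choices, not the authors' code.

```python
# Minimal sketch of the linear-distillation experiment (d = 1000, n = 20).
# Assumptions: isotropic Gaussian inputs stand in for the paper's
# polynomial-angle task, and the student is obtained via the closed form
# "project the teacher's weights onto the span of the transfer set"
# rather than by running gradient descent on the distillation loss.
import numpy as np

rng = np.random.default_rng(0)
d, n, n_test = 1000, 20, 100_000  # input dim, transfer set size, eval size

# Teacher: a fixed linear classifier x -> sign(w_t @ x).
w_t = rng.standard_normal(d)
w_t /= np.linalg.norm(w_t)

# Transfer set (unlabeled inputs shown to the student).
X = rng.standard_normal((n, d))

# Distilled student: orthogonal projection of w_t onto span(rows of X),
# i.e. w_s = X^T (X X^T)^{-1} X w_t.
w_s = X.T @ np.linalg.solve(X @ X.T, X @ w_t)

# Transfer risk: fraction of held-out inputs on which student and
# teacher predictions disagree.
X_test = rng.standard_normal((n_test, d))
risk = np.mean(np.sign(X_test @ w_t) != np.sign(X_test @ w_s))
print(f"estimated transfer risk: {risk:.3f}")
```

With only n = 20 random directions in d = 1000 dimensions, the projection captures little of w_t, so the estimated risk is large. The paper's Figure 3 studies how this risk varies with the angular-alignment parameter \kappa, an effect the simplified Gaussian sampling above does not model.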