Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
SLaM: Student-Label Mixing for Distillation with Unlabeled Examples
Authors: Vasilis Kontonis, Fotis Iliopoulos, Khoa Trinh, Cenk Baykal, Gaurav Menghani, Erik Vee
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we present a principled method for knowledge distillation with unlabeled examples that we call Student-Label Mixing (SLaM) and we show that it consistently improves over prior approaches by evaluating it on several standard benchmarks. |
| Researcher Affiliation | Collaboration | Vasilis Kontonis (UT Austin); Fotis Iliopoulos (Google Research); Khoa Trinh (Google Research); Cenk Baykal (Google Research); Gaurav Menghani (Google Research); Erik Vee (Google Research) |
| Pseudocode | Yes | In this section we present pseudo-code describing the distillation with unlabeled examples setting and the SLaM method, Algorithm 1. |
| Open Source Code | Yes | Remark B.1. We remark that in our experiments, we observed that not normalizing the mixing operation with k(x) - 1 resulted in better results overall. Therefore, the mixing operation used in our experimental evaluation of SLaM is mix(f(x; w); α(x), k(x)) = α(x) f(x; w) + (1 - α(x)) (1 - f(x; w)) top(y_s(x); k(x)). For more details we refer the reader to the code provided in the supplementary material. |
| Open Datasets | Yes | CIFAR-{10,100} and CelebA: Here we present our results on CIFAR-{10, 100} [30] and CelebA [22]. ImageNet: Here we present the results on ImageNet [49]. Large Movies Reviews Dataset: Here we present results on the Large Movies Reviews Dataset [39]. |
| Dataset Splits | Yes | For each trial we randomly split dataset C into a small (e.g., 500 examples) validation dataset V and an unlabeled training dataset U. |
| Hardware Specification | Yes | We ran our experiments on 64 Cloud TPU v4s each with two cores. |
| Software Dependencies | No | We implemented all algorithms in Python and used the TensorFlow deep learning library [1]. The paper mentions TensorFlow but does not specify a version number for it or for Python. |
| Experiment Setup | Yes | For the experiments on CIFAR-10/100 and CelebA we use the Adam optimizer with initial learning rate lr = 0.001. We then proceed according to the following learning rate schedule... For SLaM we always use 0.5 as the lower bound for isotonic regression (i.e., the parameter lb in Algorithm 2). |
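
The unnormalized mixing operation quoted in the Open Source Code row can be illustrated with a minimal sketch. It rests on two assumptions that the quoted remark does not state: that top(y_s(x); k(x)) is a 0/1 mask over the k classes the teacher scores highest, and that its product with (1 - f(x; w)) is taken elementwise. The function names `top_k_mask` and `mix_unnormalized` are hypothetical and are not taken from the authors' supplementary code.

```python
from typing import List


def top_k_mask(teacher_scores: List[float], k: int) -> List[float]:
    """Assumed reading of top(y_s(x); k(x)): 1.0 on the k classes the
    teacher scores highest, 0.0 on every other class."""
    order = sorted(
        range(len(teacher_scores)),
        key=lambda i: teacher_scores[i],
        reverse=True,
    )
    keep = set(order[:k])
    return [1.0 if i in keep else 0.0 for i in range(len(teacher_scores))]


def mix_unnormalized(
    student_probs: List[float],
    teacher_scores: List[float],
    alpha: float,
    k: int,
) -> List[float]:
    """Sketch of alpha * f + (1 - alpha) * (1 - f) * top(y_s; k), applied
    per class, without dividing by k - 1 (the variant the remark says was
    used in the experiments). The elementwise product is an assumption."""
    mask = top_k_mask(teacher_scores, k)
    return [
        alpha * p + (1.0 - alpha) * (1.0 - p) * m
        for p, m in zip(student_probs, mask)
    ]


if __name__ == "__main__":
    # Illustrative numbers only; they do not come from the paper.
    student = [0.7, 0.2, 0.1]
    teacher = [0.6, 0.3, 0.1]
    print(mix_unnormalized(student, teacher, alpha=0.8, k=2))
```

Under these assumptions, setting alpha to 1 returns the student's prediction unchanged, and because the k(x) - 1 normalization is dropped, the mixed vector need not sum to 1.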