Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Knowledge Distillation Based on Transformed Teacher Matching
Authors: Kaixiang Zheng, En-Hui Yang
ICLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiment results demonstrate that thanks to this inherent regularization, TTM leads to trained students with better generalization than the original KD. To further enhance student's capability to match teacher's power transformed probability distribution, we introduce a sample-adaptive weighting coefficient into TTM, yielding a novel distillation approach dubbed weighted TTM (WTTM). It is shown, by comprehensive experiments, that although WTTM is simple, it is effective, improves upon TTM, and achieves state-of-the-art accuracy performance. |
| Researcher Affiliation | Academia | Kaixiang Zheng & En-Hui Yang Department of Electrical and Computer Engineering, University of Waterloo EMAIL |
| Pseudocode | Yes | In this section, we provide the pseudo-code for TTM and WTTM in a Pytorch-like style, shown in Algorithm 1. It's clear that both TTM and WTTM are quite easy to implement. |
| Open Source Code | Yes | Our source code is available at https://github.com/zkxufo/TTM. |
| Open Datasets | Yes | We benchmark TTM and WTTM on two prevailing image classification datasets, namely CIFAR100 and ImageNet (Deng et al., 2009). |
| Dataset Splits | Yes | CIFAR-100 contains 60k 32×32 color images of 100 classes, with 600 images per class, and it's further split into 50k training images and 10k test images. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are mentioned for the experimental setup. |
| Software Dependencies | No | The paper mentions 'torchdistill (Matsubara, 2021) library' and 'PyTorch (Paszke et al., 2019)' but does not specify their version numbers or other software dependencies with versions. |
| Experiment Setup | Yes | Note that we list T and β values of all experiments in A.4 for reproducibility. |
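
The excerpts above describe TTM as having the student match the teacher's power-transformed probability distribution, with WTTM adding a sample-adaptive weighting coefficient, and they name T and β as the hyperparameters. The following is a minimal pure-Python sketch of that idea for a single sample, not the authors' Algorithm 1. Several things here are assumptions rather than facts taken from this page: the student side uses no temperature, the power exponent is 1/T, the distillation term is a KL divergence, and the WTTM weight is the power sum of the teacher distribution. The function names and the default values of `T` and `beta` are hypothetical placeholders; the values actually used are listed in Appendix A.4 of the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def power_transform(probs, gamma):
    """Raise each probability to the power gamma and renormalize.

    Returns the transformed distribution and the power sum (the normalizer).
    """
    powered = [p ** gamma for p in probs]
    power_sum = sum(powered)
    return [v / power_sum for v in powered], power_sum


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def ttm_loss(student_logits, teacher_logits, label, T=4.0, beta=1.0, weighted=False):
    """Hedged sketch of a TTM-style loss for one sample.

    ASSUMPTIONS (not confirmed by this page):
      * the student distribution uses no temperature;
      * the teacher distribution is power-transformed with exponent 1/T;
      * weighted=True multiplies the distillation term by the teacher's
        power sum, standing in for the sample-adaptive WTTM coefficient.
    T and beta defaults are placeholders, not the paper's settings.
    """
    q = softmax(student_logits)
    p = softmax(teacher_logits)
    p_transformed, power_sum = power_transform(p, 1.0 / T)

    cross_entropy = -math.log(q[label] + 1e-12)
    distill = kl_divergence(p_transformed, q)
    if weighted:
        distill *= power_sum
    return cross_entropy + beta * distill
```

One property this sketch relies on is easy to check: applying a power transform with exponent 1/T to `softmax(z)` gives the same distribution as `softmax(z / T)`, so temperature scaling on the teacher side can be read as a power transform of its output probabilities. For the authors' actual implementation, see the repository linked in the Open Source Code row.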