Teacher as a Lenient Expert: Teacher-Agnostic Data-Free Knowledge Distillation
Authors: Hyunjune Shin, Dong-Wan Choi
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we show that our method successfully achieves both robustness and training stability across various teacher models, while outperforming the existing DFKD methods. |
| Researcher Affiliation | Academia | Hyunjune Shin, Dong-Wan Choi* Department of Computer Science and Engineering, Inha University, South Korea heounjunee@gmail.com, dchoi@inha.ac.kr |
| Pseudocode | No | The paper describes its methods textually and with diagrams, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that its source code is publicly available. |
| Open Datasets | Yes | We use three benchmark datasets, CIFAR-10/CIFAR-100 (Krizhevsky and Hinton 2009) and Tiny ImageNet (Deng et al. 2009). |
| Dataset Splits | No | A key challenge in DFKD arises from the unavailability of validation data, making it impossible to accurately evaluate the effectiveness of distillation. The paper does not specify validation data splits for its own experimental setup. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | For all datasets, we train ResNet-34 (He et al. 2016) as the teacher model, ResNet-18 as the student model, and DCGAN (Radford, Metz, and Chintala 2016) as the generator. In the entire DFKD process, we train ResNet-18 along with DCGAN for a particular number of epochs, 200 epochs for CIFAR-10 and 500 epochs for CIFAR-100 and Tiny ImageNet. For compared methods, we follow the same configuration of their implementations. Every measurement in this section is taken out of 4 repeated runs. We set k to 10 for CIFAR-10 and 20 for the other datasets. (See the configuration sketch below the table.) |
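
Since the paper does not release code, the setup quoted in the Experiment Setup row cannot be copied verbatim. The sketch below shows one way that reported configuration could be instantiated in PyTorch: the model choices (ResNet-34 teacher, ResNet-18 student, DCGAN generator), epoch counts, and k values come from the quoted text, while everything else (latent dimension, generator architecture details, class counts, image resolution) is an assumption made for illustration, not the authors' implementation.

```python
# Hedged sketch of the reported experiment setup (not the authors' code).
import torch.nn as nn
from torchvision.models import resnet18, resnet34

# Per-dataset schedule quoted in the paper: 200 epochs for CIFAR-10,
# 500 epochs for CIFAR-100 and Tiny ImageNet; k = 10 for CIFAR-10, 20 otherwise.
# Class counts are the standard ones for these benchmarks (assumed, not quoted).
SETUP = {
    "cifar10":       {"num_classes": 10,  "epochs": 200, "k": 10},
    "cifar100":      {"num_classes": 100, "epochs": 500, "k": 20},
    "tiny_imagenet": {"num_classes": 200, "epochs": 500, "k": 20},
}


class DCGANGenerator(nn.Module):
    """DCGAN-style generator producing 32x32 images (architecture details assumed)."""

    def __init__(self, latent_dim: int = 100, img_channels: int = 3):
        super().__init__()
        self.latent_dim = latent_dim
        self.net = nn.Sequential(
            nn.ConvTranspose2d(latent_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),  # 4x4
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),         # 8x8
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),           # 16x16
            nn.ConvTranspose2d(64, img_channels, 4, 2, 1), nn.Tanh(),                          # 32x32
        )

    def forward(self, z):
        # z: (batch, latent_dim) noise vector reshaped to a 1x1 spatial map.
        return self.net(z.view(z.size(0), self.latent_dim, 1, 1))


def build_models(dataset: str = "cifar10"):
    """Return (teacher, student, generator, schedule) for one benchmark."""
    cfg = SETUP[dataset]
    teacher = resnet34(num_classes=cfg["num_classes"])  # pretrained teacher weights would be loaded here
    student = resnet18(num_classes=cfg["num_classes"])
    generator = DCGANGenerator()
    teacher.eval()  # the teacher stays frozen during data-free distillation
    for p in teacher.parameters():
        p.requires_grad_(False)
    return teacher, student, generator, cfg
```

Under these assumptions, `build_models("cifar100")` would return a frozen teacher, a trainable student, and a generator together with the 500-epoch, k = 20 schedule reported for CIFAR-100; the distillation loop itself, hardware, and library versions remain unspecified in the paper.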