Up to 100x Faster Data-Free Knowledge Distillation

Authors: Gongfan Fang, Kanya Mo, Xinchao Wang, Jie Song, Shitao Bei, Haofei Zhang, Mingli Song (pp. 6597-6604)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments over CIFAR, NYUv2, and ImageNet demonstrate that the proposed FastDFKD achieves 10x and even 100x acceleration while preserving performance on par with the state of the art.
Researcher Affiliation | Collaboration | Gongfan Fang (1,3*), Kanya Mo (1), Xinchao Wang (2), Jie Song (1), Shitao Bei (1), Haofei Zhang (1), Mingli Song (1,3); 1 Zhejiang University, 2 National University of Singapore, 3 Alibaba-Zhejiang University Joint Research Institute of Frontier Technologies
Pseudocode | Yes | Algorithm 1: Fast DFKD
Open Source Code | Yes | Code is available at https://github.com/zju-vipa/Fast-Datafree.
Open Datasets | Yes | "We evaluate the proposed method on both classification and semantic segmentation tasks. For image classification, we conduct data-free knowledge distillation on three widely used datasets: CIFAR-10, CIFAR-100 (Krizhevsky, Hinton et al. 2009) and ImageNet (Deng et al. 2009). ... For semantic segmentation, we use Deeplab models (Chen et al. 2017) trained on the NYUv2 (Nathan Silberman and Fergus 2012) dataset for training and evaluation..."
Dataset Splits | No | No specific training/validation/test dataset splits (e.g., percentages, absolute counts, or explicit mention of a validation split) are detailed in the paper text.
Hardware Specification | No | "For fair comparisons, all GPU hours are estimated on a single GPU." This statement is too general and does not specify a GPU model or other hardware details.
Software Dependencies | No | No specific software dependencies with version numbers are mentioned.
Experiment Setup | Yes | "We use the pretrained models from (Fang et al. 2021b) and follow the same training protocol for comparison, where 50,000 synthetic images are synthesized for distillation." and "For example, DeepInv-2k synthesizes images by optimizing mini-batches, each of which requires 2,000 iterations to converge (Yin et al. 2019). To obtain 50,000 training samples for CIFAR, DeepInv-2k would take 42.1 hours for data synthesis on a single GPU. By contrast, our method, i.e., Fast-5, adopts the same inversion loss as DeepInv but only requires 5 steps for each batch owing to the proposed common feature reusing, which is much more efficient than DeepInversion."
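For context, the following is a minimal PyTorch sketch of the speedup mechanism quoted above: each synthetic batch starts from a reusable shared initialization (the paper's "common features") rather than random noise, so only a few inversion steps per batch are needed instead of thousands. This is not the authors' implementation (see the linked repository for that); the names inversion_loss, synthesize_batch, and common_init are hypothetical, and the objective here keeps only the classification term, whereas DeepInversion-style losses also match BatchNorm statistics and add image priors.

```python
# Hedged sketch, not the official Fast-Datafree code.
import torch
import torch.nn as nn
import torch.nn.functional as F


def inversion_loss(teacher, images, targets):
    """Simplified inversion objective (classification prior only).
    The real DeepInversion loss also matches BatchNorm statistics
    and regularizes the images."""
    logits = teacher(images)
    return F.cross_entropy(logits, targets)


def synthesize_batch(teacher, common_init, targets, steps=5, lr=0.1):
    """Specialize a shared initialization into one synthetic batch.
    Starting from `common_init` (a stand-in for the reusable common
    features) lets a handful of steps suffice, instead of the ~2,000
    needed when starting from scratch."""
    images = common_init.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([images], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = inversion_loss(teacher, images, targets)
        loss.backward()
        opt.step()
    return images.detach()


# Usage sketch with a toy teacher (hypothetical shapes for CIFAR-like data).
teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()
common_init = torch.randn(64, 3, 32, 32) * 0.1  # stand-in for learned common features
targets = torch.randint(0, 10, (64,))
fast_batch = synthesize_batch(teacher, common_init, targets, steps=5)  # "Fast-5" regime
```

In the paper's setting, the shared initialization is itself learned across batches so that it transfers, which is what makes the 5-step regime competitive with 2,000-step per-batch inversion.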