Knowledge Diffusion for Distillation
Authors: Tao Huang, Yuan Zhang, Mingkai Zheng, Shan You, Fei Wang, Chen Qian, Chang Xu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that DiffKD is effective across various types of features and achieves state-of-the-art performance consistently on image classification, object detection, and semantic segmentation tasks. |
| Researcher Affiliation | Collaboration | Tao Huang (1,2), Yuan Zhang (3), Mingkai Zheng (1), Shan You (2), Fei Wang (4), Chen Qian (2), Chang Xu (1). Affiliations: (1) School of Computer Science, Faculty of Engineering, The University of Sydney; (2) SenseTime Research; (3) Peking University; (4) University of Science and Technology of China. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/hunto/DiffKD. |
| Open Datasets | Yes | The paper uses well-known public datasets such as ImageNet, CIFAR-100, the COCO dataset, and the Cityscapes dataset, citing their original sources or related works. |
| Dataset Splits | Yes | The paper summarizes training strategies, including epochs, batch size, learning rate, optimizer, and data augmentation, in Table 1. It also reports validation results, such as the 'Val' mIoU column in Table 6 for the Cityscapes dataset. |
| Hardware Specification | Yes | We run all the models on 8 V100 GPUs. |
| Software Dependencies | No | The paper mentions using MMDetection [4] and Torchvision [28] (a PyTorch package) but does not provide specific version numbers for these or other software dependencies, which would be necessary for full reproducibility. A version-logging sketch follows the table. |
| Experiment Setup | Yes | The paper provides specific experimental setup details, including training strategies (Table 1: epochs, batch size, LR, optimizer, data augmentation), loss weights (e.g., λ1 = λ2 = λ3 = 1 and a DiffKD loss weight of 5), autoencoder latent channel sizes (1024 or 768), and semantic segmentation training details (random flipping, scaling, crop size, SGD optimizer with momentum 0.9, polynomial LR scheduler). A minimal configuration sketch follows the table. |
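The missing dependency versions are the main gap flagged in the Software Dependencies row, so a reproduction attempt should record the versions it actually runs with. A minimal sketch, assuming the standard package names `torch`, `torchvision`, and `mmdet` (the paper itself does not name import paths):

```python
# Minimal sketch: log the versions of the libraries the paper names
# (Torchvision via PyTorch, MMDetection), since no versions are reported.
import importlib

for name in ("torch", "torchvision", "mmdet"):  # "mmdet" is MMDetection's package name
    try:
        module = importlib.import_module(name)
        print(f"{name}=={module.__version__}")
    except ImportError:
        print(f"{name}: not installed")
```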
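The segmentation settings quoted in the Experiment Setup row (SGD with momentum 0.9, a polynomial LR scheduler, loss weights λ1 = λ2 = λ3 = 1, DiffKD loss weight 5) map onto a short PyTorch sketch. The model, base learning rate, iteration budget, polynomial power, and the individual loss terms are placeholders, not values taken from the paper:

```python
# Sketch of the reported training configuration; anything marked "assumed"
# is a placeholder rather than a value stated in the paper.
import torch

model = torch.nn.Conv2d(3, 19, kernel_size=1)  # stand-in for the real student network

# SGD with momentum 0.9 as reported; the base LR of 0.01 is assumed.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Polynomial LR decay as reported; total_iters and power 0.9 are assumed.
scheduler = torch.optim.lr_scheduler.PolynomialLR(optimizer, total_iters=40_000, power=0.9)

# Loss weighting as reported: lambda1 = lambda2 = lambda3 = 1, DiffKD weight 5.
# The loss terms below are dummy tensors standing in for the paper's actual losses.
task_loss, loss1, loss2, loss3, diffkd_loss = (
    torch.tensor(1.0, requires_grad=True) for _ in range(5)
)
total_loss = task_loss + 1.0 * loss1 + 1.0 * loss2 + 1.0 * loss3 + 5.0 * diffkd_loss

total_loss.backward()
optimizer.step()
scheduler.step()
```

In an actual reproduction, the dummy losses would be replaced by the task and distillation terms trained under the recipes summarized in Table 1.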