Shadow Knowledge Distillation: Bridging Offline and Online Knowledge Transfer

Authors: Lujun Li, Zhe Jin

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on classification and object detection tasks demonstrate that our technique achieves state-of-the-art results with different CNNs and Vision Transformer models.
Researcher Affiliation | Academia | Lujun Li (1,2), Zhe Jin (1); 1: School of Artificial Intelligence, Anhui University, China; 2: Chinese Academy of Sciences, China; lilujunai@gmail.com; jinzhe@ahu.edu.cn
Pseudocode | No | The paper describes algorithms and methods but does not include a formal 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | Code is made publicly available at https://lilujunai.github.io/SHAKE/.
Open Datasets | Yes | We conduct extensive experiments on multiple tasks (e.g., classification and detection) and datasets (e.g., CIFAR-10, CIFAR-100, Tiny-ImageNet, ImageNet, and MS-COCO) to verify the superiority of the proposed method.
Dataset Splits | Yes | Following CRD's settings [54], with 240 training epochs, we perform experiments on several teacher-student pairs on CIFAR-100, using either the same or a different architecture style. We employ a conventional SGD optimizer with a weight decay of 0.0005 and a mini-batch size of 64. The learning rate is initialized at 0.05 and decays by a factor of 0.1 at epochs 150, 180, and 210 (see the training-schedule sketch after the table).
Hardware Specification | Yes | Training time is measured on a single 2080Ti GPU, and the reported ratios represent improvements over KD.
Software Dependencies | No | The paper mentions frameworks such as Detectron2 but does not specify any software dependencies with version numbers.
Experiment Setup | Yes | We set λ = 1 and τ = 4 in SHAKE. We employ a conventional SGD optimizer with a weight decay of 0.0005 and a mini-batch size of 64. The learning rate is initialized at 0.05 and decays by a factor of 0.1 at epochs 150, 180, and 210 (see the loss-weighting sketch after the table).
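
The training schedule quoted under "Dataset Splits" and "Experiment Setup" maps onto a standard PyTorch optimizer/scheduler configuration. The sketch below is a minimal illustration of that schedule, assuming PyTorch and torchvision; the ResNet-18 student, the momentum value of 0.9, and the skeleton training loop are assumptions and are not taken from the paper.

import torch
import torchvision

# Hypothetical student network for CIFAR-100 (100 classes); the paper evaluates
# several teacher-student pairs, so this choice is only illustrative.
student = torchvision.models.resnet18(num_classes=100)

# SGD with weight decay 0.0005 and initial learning rate 0.05, as quoted;
# the momentum value is an assumption, since the quote does not state it.
optimizer = torch.optim.SGD(student.parameters(), lr=0.05,
                            momentum=0.9, weight_decay=5e-4)

# Multi-step decay by a factor of 0.1 at epochs 150, 180, and 210,
# over 240 training epochs (the CRD schedule).
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[150, 180, 210], gamma=0.1)

for epoch in range(240):
    # ... one training epoch over CIFAR-100 with mini-batch size 64 ...
    scheduler.step()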
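
The λ = 1 and τ = 4 values quoted under "Experiment Setup" correspond to the weight on the distillation term and the softmax temperature in a temperature-scaled KD objective. The sketch below shows only that generic Hinton-style loss, not SHAKE's shadow-head distillation; the function name kd_loss and its signature are hypothetical.

import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, lam=1.0, tau=4.0):
    # Cross-entropy with the ground-truth labels.
    ce = F.cross_entropy(student_logits, targets)
    # KL divergence between temperature-softened distributions,
    # scaled by tau**2 to keep gradient magnitudes comparable.
    kld = F.kl_div(F.log_softmax(student_logits / tau, dim=1),
                   F.softmax(teacher_logits / tau, dim=1),
                   reduction="batchmean") * (tau ** 2)
    # Total loss: cross-entropy plus the lambda-weighted distillation term.
    return ce + lam * kld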