Progressively Knowledge Distillation via Re-parameterizing Diffusion Reverse Process
Authors: Xufeng Yao, Fanbin Lu, Yuechen Zhang, Xinyun Zhang, Wenqian Zhao, Bei Yu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present extensive experiments performed on various transfer scenarios, such as CNN-to-CNN and Transformer-to-CNN, that validate the effectiveness of our approach. |
| Researcher Affiliation | Academia | Xufeng Yao, Fanbin Lu, Yuechen Zhang, Xinyun Zhang, Wenqian Zhao, Bei Yu Department of Computer Science & Engineering, The Chinese University of Hong Kong {xfyao,fblu21,yczhang21,xyzhang21,wqzhao,byu}@cse.cuhk.edu.hk |
| Pseudocode | No | The main body of the paper does not contain any clearly labeled pseudocode or algorithm blocks. It mentions that more details are in the appendix, but the appendix content is not provided. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | We experiment with different settings varying architectures and datasets, including: CIFAR-100 (Krizhevsky, Hinton et al. 2009), which consists of 32×32 images with 100 categories; training and validation sets are composed of 50k and 10k images. ImageNet-1k (Deng et al. 2009), which contains over 1280k images with 1000 categories. ImageNet-100 is a subset of ImageNet which contains roughly 120k images. |
| Dataset Splits | Yes | CIFAR-100 (Krizhevsky, Hinton et al. 2009), which consists of 32×32 images with 100 categories; training and validation sets are composed of 50k and 10k images. ImageNet-100 is a subset of ImageNet which contains roughly 120k images. The training and validation splitting rule is introduced in (Wang and Isola 2020). A loading sketch for these splits follows the table. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Our implementation is mainly based on the DKD... Review... and CRD', implying software dependencies, but does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow, etc.). |
| Experiment Setup | No | The paper states 'Our implementation is mainly based on the DKD (Zhao et al. 2022), Review (Chen et al. 2021b), and CRD (Tian, Krishnan, and Isola 2020) with the default training and testing setting,' but it does not provide specific hyperparameter values, optimizer settings, or detailed training configurations within the main text. |
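
The dataset descriptions quoted in the Open Datasets and Dataset Splits rows are concrete enough to sketch a loader. Below is a minimal sketch assuming PyTorch/torchvision; the paper releases no code, so this does not reflect the authors' actual implementation. `IMAGENET100_CLASSES` and the ImageNet path are placeholders: the real 100-class list must be taken from Wang and Isola (2020).

```python
# Sketch of the dataset setup quoted above, assuming PyTorch/torchvision.
# This is NOT the authors' code; the paper publishes no implementation.
import torchvision
import torchvision.transforms as T
from torch.utils.data import Subset

transform = T.ToTensor()

# CIFAR-100: 100 categories of 32x32 images. torchvision's canonical split
# already matches the 50k-train / 10k-validation counts quoted in the table.
cifar_train = torchvision.datasets.CIFAR100(
    root="./data", train=True, download=True, transform=transform)
cifar_val = torchvision.datasets.CIFAR100(
    root="./data", train=False, download=True, transform=transform)
assert len(cifar_train) == 50_000 and len(cifar_val) == 10_000

# ImageNet-100: a roughly 120k-image, 100-class subset of ImageNet-1k.
# The class list follows Wang and Isola (2020); the empty list below is
# only a placeholder for their 100 WordNet IDs (e.g. "n01558993", ...).
IMAGENET100_CLASSES: list = []

imagenet = torchvision.datasets.ImageFolder(
    "path/to/imagenet/train", transform=transform)  # hypothetical path
keep = {imagenet.class_to_idx[c] for c in IMAGENET100_CLASSES}
imagenet100 = Subset(
    imagenet, [i for i, (_, y) in enumerate(imagenet.samples) if y in keep])
```

Note that CIFAR-100 needs no custom split logic, since the torchvision split matches the counts quoted from the paper; only the ImageNet-100 subset requires the external class list.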