Discovering and Overcoming Limitations of Noise-engineered Data-free Knowledge Distillation
Authors: Piyush Raikwar, Deepak Mishra
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our approach on CIFAR10, CIFAR100, SVHN, and Food101 datasets. |
| Researcher Affiliation | Academia | Piyush Raikwar, ABV-IIITM Gwalior, India, imt_2017062@iiitm.ac.in; Deepak Mishra, IIT Jodhpur, India, dmishra@iitj.ac.in |
| Pseudocode | Yes | Algorithm 1 (Training KD) and Algorithm 2 (Evaluation) |
| Open Source Code | Yes | Code is available at: https://github.com/Piyush-555/GaussianDistillation |
| Open Datasets | Yes | We validate our approach on CIFAR10, CIFAR100, SVHN, and Food101 datasets. |
| Dataset Splits | No | The paper mentions a "validation set" once in passing and refers to "test data" and "training data" subsets for finetuning, but it does not give train/validation/test split percentages, sample counts, or citations to predefined splits that would allow the data partitioning to be reproduced. |
| Hardware Specification | No | No specific hardware details (GPU/CPU models, memory amounts, or detailed computer specifications) are provided. |
| Software Dependencies | No | The paper mentions using "Adam optimizer" but does not provide specific software dependencies like framework versions (e.g., PyTorch 1.9) or other library versions needed for replication. |
| Experiment Setup | Yes | In both cases, the batch size is 256 and an Adam optimizer with a learning rate of 10^-3 is used to tune the parameters of the student network. For finetuning, a subset of the training data is sampled randomly and a reduced learning rate of 10^-4 is used (a minimal sketch of this setup follows the table). |
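
The "Pseudocode" and "Experiment Setup" rows above reference Algorithm 1 (Training KD) and the reported hyperparameters (batch size 256, Adam with learning rate 10^-3, finetuning at 10^-4). The sketch below is only an illustration of what such a noise-driven, data-free distillation step could look like, assuming PyTorch (the paper does not name its framework); the `distill_step` function, the temperature value, the image shape, and the use of plain Gaussian noise are illustrative assumptions, not the authors' implementation of their noise-engineering scheme.

```python
# Hedged sketch of a data-free KD step on Gaussian-noise inputs.
# Not the authors' code; teacher/student models are placeholders.
import torch
import torch.nn.functional as F

def distill_step(teacher, student, optimizer, batch_size=256, temperature=4.0,
                 image_shape=(3, 32, 32), device="cuda"):
    """One distillation step on Gaussian-noise inputs (temperature and shape are assumptions)."""
    teacher.eval()
    student.train()

    # Gaussian noise stands in for training data (data-free setting).
    noise = torch.randn(batch_size, *image_shape, device=device)

    with torch.no_grad():
        teacher_logits = teacher(noise)
    student_logits = student(noise)

    # Standard KD loss: KL divergence between softened teacher and student outputs.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Optimizer settings reported in the paper: Adam, lr 1e-3 for distillation,
# reduced to 1e-4 when finetuning on a randomly sampled subset of the training data.
# optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
```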