Task-Oriented Feature Distillation
Authors: Linfeng Zhang, Yukang Shi, Zuoqiang Shi, Kaisheng Ma, Chenglong Bao
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that TOFD outperforms other distillation methods by a large margin on both image classification and 3D classification tasks. |
| Researcher Affiliation | Academia | Linfeng Zhang1, Yukang Shi2, Zuoqiang Shi1, Kaisheng Ma1, Chenglong Bao1; Tsinghua University1, Xi'an Jiaotong University2 |
| Pseudocode | No | The paper provides figures and mathematical formulations but does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Codes have been released on GitHub: https://github.com/ArchipLab-LinfengZhang/Task-Oriented-Feature-Distillation |
| Open Datasets | Yes | The experiments of image classification are conducted with nine kinds of convolutional neural networks, including ResNet [17], PreActResNet [18], SENet [25], ResNeXt [65], MobileNetV1 [24], MobileNetV2 [55], ShuffleNetV1 [42], ShuffleNetV2 [43], WideResNet [69], and three datasets, including CIFAR100 and CIFAR10 [33] and ImageNet [12]. |
| Dataset Splits | No | The paper mentions using well-known datasets like CIFAR100, CIFAR10, ImageNet, ModelNet10, and ModelNet40, but it does not explicitly provide specific details about how the training, validation, and test splits were performed (e.g., specific percentages, sample counts, or explicit references to predefined splits). |
| Hardware Specification | No | The paper describes the experimental settings (e.g., optimizers, batch sizes, epochs) but does not provide any specific details about the hardware used to run the experiments (e.g., GPU models, CPU types, or memory). |
| Software Dependencies | No | The paper describes the experimental settings but does not list any specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | In the CIFAR experiments, each model is trained for 300 epochs with the SGD optimizer and a batch size of 128. In the ImageNet experiments, each model is trained for 90 epochs with the SGD optimizer and a batch size of 256. In the 3D classification experiments, each model is trained for 100 epochs with Adam, with learning rate decay every 20 epochs (see the training sketch after the table). |
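To make the reported CIFAR setup concrete, below is a minimal training sketch, assuming PyTorch and torchvision. It uses ResNet-18 as a stand-in student network, omits the TOFD distillation loss itself, and the learning-rate milestones are illustrative assumptions, since the summary above only specifies the optimizer, epoch count, and batch size.

```python
# Minimal sketch of the reported CIFAR training setup (not the authors' exact code).
# Assumptions: PyTorch/torchvision available; ResNet-18 is a stand-in student;
# the TOFD distillation terms and the exact LR schedule are not reproduced here.
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T
from torchvision.models import resnet18


def train_cifar100(epochs=300, batch_size=128, lr=0.1):
    transform = T.Compose([
        T.RandomCrop(32, padding=4),
        T.RandomHorizontalFlip(),
        T.ToTensor(),
    ])
    train_set = torchvision.datasets.CIFAR100(
        root="./data", train=True, download=True, transform=transform)
    loader = torch.utils.data.DataLoader(
        train_set, batch_size=batch_size, shuffle=True, num_workers=4)

    model = resnet18(num_classes=100)   # stand-in student network
    criterion = nn.CrossEntropyLoss()   # distillation loss terms omitted
    optimizer = torch.optim.SGD(
        model.parameters(), lr=lr, momentum=0.9, weight_decay=5e-4)
    # Assumed milestones: the summary only states 300 epochs with SGD.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[150, 225], gamma=0.1)

    for epoch in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()
```

For the actual distillation objective, teacher/student pairing, and schedules (including the Adam setup with decay every 20 epochs used for 3D classification), refer to the authors' released repository.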