Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data
Authors: Ashraful Islam, Chun-Fu (Richard) Chen, Rameswar Panda, Leonid Karlinsky, Rogerio Feris, Richard J. Radke
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our model outperforms the current state-of-the-art method by 4.4% for 1-shot and 3.6% for 5-shot classification on the BSCD-FSL benchmark, and also shows competitive performance on traditional in-domain few-shot learning tasks. |
| Researcher Affiliation | Collaboration | Ashraful Islam (Rensselaer Polytechnic Institute, islama6@rpi.edu); Chun-Fu Chen (MIT-IBM Watson AI Lab, chenrich@us.ibm.com); Rameswar Panda (MIT-IBM Watson AI Lab, rpanda@ibm.com); Leonid Karlinsky (IBM Research, leonidka@il.ibm.com); Rogerio Feris (IBM Research, rsferis@us.ibm.com); Richard J. Radke (Rensselaer Polytechnic Institute, rjradke@ecse.rpi.edu) |
| Pseudocode | Yes | Please refer to the Appendix for PyTorch-like pseudo-code of our method. |
| Open Source Code | Yes | Our code is available at: https://git.io/Jilgs. |
| Open Datasets | Yes | Dataset We follow the evaluation protocol of the BSCD-FSL benchmark [7], which contains novel data from CropDisease [17], EuroSAT [8], ISIC [4], and ChestX [32]. The base dataset is miniImageNet [29], which contains 100 classes from the ImageNet dataset [5] where each class has 600 images. The novel datasets are chosen based on increasing dissimilarity from the miniImageNet dataset. More details about the datasets are provided in the Appendix. Following [18], we randomly sample 20% of the data from each novel dataset to construct the unlabeled set D_U, and the remaining images are used for evaluation, where we perform 5-way 1-shot and 5-way 5-shot classification. For the evaluation metric, we report top-1 accuracy and 95% confidence interval over 600 runs. We also report evaluation results on the larger tieredImageNet dataset [19]. |
| Dataset Splits | Yes | The classes are grouped into 34 super-categories, from which 20 training categories (351 classes), 6 validation categories (97 classes), and 8 testing categories (160 classes) are selected. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are provided for running experiments. General statements like 'ResNet-10 as the backbone network' refer to the model architecture, not the hardware used for computation. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch, TensorFlow, CUDA) are explicitly mentioned. |
| Experiment Setup | Yes | We use SGD with momentum 0.9, weight decay 1e-4, learning rate 0.01, batch size 32, and the cosine learning rate scheduler. During training, we increase the weight of distillation loss λ from 0 to 1 until 40 epochs using cosine scheduling. The sharpening temperature and teacher momentum parameter are set to 0.1 and 0.99 respectively. |
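The "Experiment Setup" row quotes the optimization hyperparameters and the distillation schedule: SGD with momentum 0.9, weight decay 1e-4, learning rate 0.01, batch size 32, a cosine learning-rate schedule, the distillation weight λ ramped from 0 to 1 over the first 40 epochs with cosine scheduling, sharpening temperature 0.1, and teacher momentum 0.99. The sketch below is a minimal, generic momentum-teacher distillation step consistent with those numbers, assuming a standard cross-entropy plus KL-divergence objective; it is an illustration, not the authors' released implementation (see the linked repository for that), and the model and data-loader names are placeholders.

```python
# Hedged sketch of a momentum-teacher distillation step using the quoted
# hyperparameters. Models/loaders are placeholders, not the authors' code.
import math
import torch
import torch.nn.functional as F

def distill_weight(epoch, ramp_epochs=40):
    """Cosine ramp of the distillation weight lambda from 0 to 1 over `ramp_epochs`."""
    if epoch >= ramp_epochs:
        return 1.0
    return 0.5 * (1.0 - math.cos(math.pi * epoch / ramp_epochs))

@torch.no_grad()
def ema_update(teacher, student, momentum=0.99):
    """Momentum (EMA) update of the teacher parameters from the student."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)

def train_epoch(student, teacher, labeled_loader, unlabeled_loader, optimizer,
                epoch, temperature=0.1):
    """One epoch: supervised loss on base data + sharpened distillation on unlabeled data."""
    lam = distill_weight(epoch)
    for (x_l, y_l), x_u in zip(labeled_loader, unlabeled_loader):
        ce_loss = F.cross_entropy(student(x_l), y_l)  # supervised loss on labeled base data
        with torch.no_grad():
            # Teacher predictions on unlabeled data, sharpened with temperature T = 0.1
            soft_targets = F.softmax(teacher(x_u) / temperature, dim=1)
        log_probs = F.log_softmax(student(x_u), dim=1)
        kd_loss = F.kl_div(log_probs, soft_targets, reduction="batchmean")
        loss = ce_loss + lam * kd_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        ema_update(teacher, student)  # momentum teacher update after each student step

# Optimizer and LR schedule as quoted in the table:
# optimizer = torch.optim.SGD(student.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)
# scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)
```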
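The "Open Datasets" row describes the evaluation protocol: 20% of each novel dataset is randomly held out as the unlabeled set D_U, 5-way 1-shot and 5-way 5-shot episodes are drawn from the remaining images, and top-1 accuracy is averaged over 600 runs. The sketch below illustrates that protocol; the function names, the seed handling, and the 15-image query set per class are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of the 20% unlabeled split and 5-way k-shot episode sampling
# described above. Helper names and the query-set size are assumptions.
import random
from collections import defaultdict

def split_unlabeled(samples, unlabeled_frac=0.2, seed=0):
    """Randomly hold out a fraction of a novel dataset as the unlabeled set D_U."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * unlabeled_frac)
    return shuffled[:cut], shuffled[cut:]  # (unlabeled D_U, evaluation pool)

def sample_episode(eval_pool, n_way=5, k_shot=1, n_query=15, seed=None):
    """Draw one n-way k-shot episode (support + query sets) from the evaluation pool."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for image, label in eval_pool:
        by_class[label].append(image)
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        imgs = rng.sample(by_class[cls], k_shot + n_query)
        support += [(img, episode_label) for img in imgs[:k_shot]]
        query += [(img, episode_label) for img in imgs[k_shot:]]
    return support, query

# Evaluation: top-1 accuracy and 95% confidence interval over 600 sampled episodes.
```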