Do Adversarially Robust ImageNet Models Transfer Better?
Authors: Hadi Salman, Andrew Ilyas, Logan Engstrom, Ashish Kapoor, Aleksander Madry
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We carry out our study on a suite of image classification tasks (summarized in Table 1), object detection, and instance segmentation. In this work, we identify another factor that affects transfer learning performance: adversarial robustness [Big+13; Sze+14]. We find that despite being less accurate on ImageNet, adversarially robust neural networks match or improve on the transfer performance of their standard counterparts. |
| Researcher Affiliation | Collaboration | Hadi Salman (hadi.salman@microsoft.com, Microsoft Research); Andrew Ilyas (ailyas@mit.edu); Logan Engstrom (engstrom@mit.edu); Ashish Kapoor (akapoor@microsoft.com, Microsoft Research); Aleksander Madry (madry@mit.edu) |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | Yes | Our code and models are available at https://github.com/Microsoft/robust-models-transfer. |
| Open Datasets | Yes | A prototypical transfer learning pipeline in computer vision (and the focus of our work) starts with a model trained on the ImageNet-1K dataset [Den+09; Rus+15], and then refines this model for the target task. To resolve these two conflicting hypotheses, we use a test bed of 12 standard transfer learning datasets (all the datasets considered in [KSL19] as well as Caltech-256 [GHP07]). We evaluate with benchmarks in both object detection (PASCAL Visual Object Classes (VOC) [Eve+10] and Microsoft COCO [Lin+14]) and instance segmentation (Microsoft COCO). |
| Dataset Splits | No | The paper notes that 'The hyperparameters for training were found via grid search (cf. Appendix A)' and names the datasets used, but it does not explicitly state the training, validation, and test splits (e.g., percentages or counts) in the main text. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or cloud resources used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'Detectron2 [Wu+19] framework' but does not give version numbers for it or for any other software dependency, which reproducibility requires. |
| Experiment Setup | Yes | For the full experimental setup, see Appendix A. The hyperparameters for training were found via grid search (cf. Appendix A). We train systems using default models and hyperparameter configurations from the Detectron2 [Wu+19] framework... Appendix C describes further experimental details and more results. |
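The transfer pipeline quoted above (pretrain on ImageNet, then refine for a target task) can be sketched in PyTorch. This is a minimal illustration, not the paper's code: the tiny `nn.Sequential` backbone stands in for an ImageNet-pretrained (possibly adversarially robust) network such as those released at the linked repository, and the class count is an arbitrary example.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for an ImageNet-pretrained backbone (e.g. a robust
# ResNet from github.com/Microsoft/robust-models-transfer); a tiny
# Sequential is used here so the sketch stays self-contained.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),  # global average pooling to (N, 8, 1, 1)
    nn.Flatten(),             # features of dimension 8
)
feature_dim = 8
num_target_classes = 10  # example target task with 10 classes

# "Fixed-feature" transfer: freeze the pretrained backbone ...
for p in backbone.parameters():
    p.requires_grad = False

# ... and train only a fresh linear head on the target dataset.
# (For "full-network" transfer, simply skip the freezing step.)
head = nn.Linear(feature_dim, num_target_classes)
model = nn.Sequential(backbone, head)

x = torch.randn(2, 3, 32, 32)       # a dummy batch of two images
logits = model(x)
print(logits.shape)                  # torch.Size([2, 10])
```

The two regimes shown (frozen backbone vs. full fine-tuning) correspond to the two transfer settings the paper compares across its 12 classification datasets.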