What Makes Instance Discrimination Good for Transfer Learning?

Authors: Nanxuan Zhao, Zhirong Wu, Rynson W. H. Lau, Stephen Lin

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our findings are threefold. First, what truly matters for the transfer is low-level and mid-level representations, not high-level representations. Second, the intra-category invariance enforced by the traditional supervised model weakens transferability by increasing task misalignment. Finally, supervised pretraining can be strengthened by following an exemplar-based approach without explicit constraints among the instances within the same category. We study the transfer performance of pretrained models for a set of downstream tasks: object detection on PASCAL VOC07, object detection and instance segmentation on MSCOCO, and semantic segmentation on Cityscapes.
Researcher Affiliation | Collaboration | City University of Hong Kong; Microsoft Research Asia
Pseudocode | No | The paper describes its methods in prose, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper provides a project URL (http://nxzhao.com/projects/good_transfer/), but it does not explicitly state that the source code for the described methodology is openly released, nor does it link directly to a source code repository such as GitHub.
Open Datasets | Yes | We study the transfer performance of pretrained models for a set of downstream tasks: object detection on PASCAL VOC07, object detection and instance segmentation on MSCOCO, and semantic segmentation on Cityscapes. The pretraining method MoCo (He et al., 2020) established a milestone by outperforming the supervised counterpart, with an AP of 46.6 compared to 42.4 on PASCAL VOC object detection. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009.
Dataset Splits | Yes | We study the transfer performance of pretrained models for a set of downstream tasks: object detection on PASCAL VOC07, object detection and instance segmentation on MSCOCO, and semantic segmentation on Cityscapes. For the base classes, we split their data into training and validation sets to evaluate base task performance.
Hardware Specification | No | The paper mentions running experiments on "8 GPUs" and "4 GPUs" but does not specify the exact models or other hardware details (CPU, RAM, specific machine types, or cloud instances) used for the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow), programming languages (e.g., Python), or other libraries used in the implementation.
Experiment Setup | Yes | For object detection on PASCAL VOC07, we use the ResNet50-C4 architecture in the Faster R-CNN framework (Ren et al., 2015). Optimization takes 9k iterations on 8 GPUs with a batch size of 2 images per GPU. The learning rate is initialized to 0.02 and decayed to be 10 times smaller after 6k and 8k iterations. For semantic segmentation on Cityscapes, we use the DeepLab-v3 architecture (Chen et al., 2017) with image crops of 512 by 1024. Optimization takes 40k iterations on 4 GPUs with a batch size of 2 images per GPU. The learning rate is initialized to 0.01 and decayed with a poly schedule.
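For readers unfamiliar with the exemplar-based objective referenced in the Research Type row, the snippet below is a minimal sketch of an instance-discrimination (InfoNCE-style) loss in the spirit of MoCo: each image instance is treated as its own class, with no explicit constraint tying together instances that share a semantic category. This is an illustrative sketch only; the function name, temperature value, and queue handling are assumptions, not taken from the authors' code.

```python
# Minimal sketch of an instance-discrimination (InfoNCE-style) objective.
# Names and hyperparameters here are illustrative assumptions.
import torch
import torch.nn.functional as F

def instance_discrimination_loss(query, key, queue, temperature=0.07):
    """query, key: (N, D) embeddings of two augmented views of the same images.
    queue: (K, D) embeddings of other instances serving as negatives."""
    query = F.normalize(query, dim=1)
    key = F.normalize(key, dim=1)
    queue = F.normalize(queue, dim=1)

    # Positive logits: similarity between the two views of the same instance.
    l_pos = torch.einsum("nd,nd->n", query, key).unsqueeze(1)   # (N, 1)
    # Negative logits: similarity against every queued (other-instance) key.
    l_neg = torch.einsum("nd,kd->nk", query, queue)             # (N, K)

    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    # The positive sits at index 0, so the target label is 0 for every row.
    labels = torch.zeros(logits.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)

# Toy usage with random embeddings.
q, k = torch.randn(8, 128), torch.randn(8, 128)
negatives = torch.randn(4096, 128)
print(instance_discrimination_loss(q, k, negatives).item())
```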
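The fine-tuning recipes quoted in the Experiment Setup row map onto standard PyTorch learning-rate schedules. The sketch below shows only those schedules (a 10x step decay at 6k and 8k of 9k iterations for VOC07 detection, and a poly decay over 40k iterations for Cityscapes segmentation); the placeholder model, the SGD momentum and weight decay, and the poly power of 0.9 are assumed common defaults rather than values reported in the paper, and the actual Faster R-CNN and DeepLab-v3 pipelines are not reproduced here.

```python
# Sketch of the two quoted learning-rate schedules with standard PyTorch schedulers.
import torch

model = torch.nn.Linear(10, 10)  # stand-in for the detection/segmentation network

# VOC07 detection: lr 0.02, decayed 10x after 6k and 8k of 9k total iterations.
det_opt = torch.optim.SGD(model.parameters(), lr=0.02, momentum=0.9, weight_decay=1e-4)
det_sched = torch.optim.lr_scheduler.MultiStepLR(det_opt, milestones=[6000, 8000], gamma=0.1)

# Cityscapes segmentation: lr 0.01 with polynomial ("poly") decay over 40k iterations.
seg_opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)
seg_sched = torch.optim.lr_scheduler.LambdaLR(
    seg_opt, lr_lambda=lambda it: (1.0 - it / 40000) ** 0.9)

# Per-iteration stepping (the forward/backward pass and loss are omitted here).
for it in range(9000):
    det_opt.step()
    det_sched.step()
```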