Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models

Authors: Yang Shu, Zhangjie Cao, Ziyang Zhang, Jianmin Wang, Mingsheng Long

NeurIPS 2022

Reproducibility assessment. Each variable below is listed with its result, followed by the supporting LLM response.

Research Type: Experimental
"Experiment results on computer vision and reinforcement learning tasks demonstrate that the proposed Hub-Pathway framework achieves state-of-the-art performance for model hub transfer learning."

Researcher Affiliation: Collaboration
"Yang Shu, Zhangjie Cao, Ziyang Zhang, Jianmin Wang, Mingsheng Long. School of Software, BNRist, Tsinghua University, China; Advanced Computing and Storage Lab, Huawei Technologies Co., Ltd. {shuyang5656,caozhangjie14}@gmail.com, {jimwang,mingsheng}@tsinghua.edu.cn"

Pseudocode: No
"The paper does not contain structured pseudocode or algorithm blocks."

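Since the paper describes the framework only in prose, here is a minimal PyTorch sketch of the data-dependent pathway routing it outlines: a gate network scores the hub's pre-trained models per input, the top-k models are activated, and their predictions are aggregated by normalized pathway weights. All module and parameter names are hypothetical reconstructions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HubPathway(nn.Module):
    """Sketch of data-dependent pathway routing over a hub of models.

    Hypothetical reconstruction from the paper's prose description:
    a gate scores all M pre-trained models per input, the top-k are
    activated, and their predictions are combined with normalized weights.
    """

    def __init__(self, backbones, heads, gate, k=2):
        super().__init__()
        self.backbones = nn.ModuleList(backbones)  # hub of pre-trained models
        self.heads = nn.ModuleList(heads)          # one task head per model
        self.gate = gate                           # pathway generation network
        self.k = k                                 # models activated per input

    def forward(self, x):
        scores = self.gate(x)                              # (B, M) pathway scores
        top_vals, top_idx = scores.topk(self.k, dim=1)     # activate top-k models
        weights = F.softmax(top_vals, dim=1)               # normalize over active models
        # For clarity, every model runs on the full batch; a faithful
        # implementation would dispatch inputs only to activated models.
        preds = torch.stack(
            [h(b(x)) for b, h in zip(self.backbones, self.heads)], dim=1
        )                                                  # (B, M, C)
        idx = top_idx.unsqueeze(-1).expand(-1, -1, preds.size(-1))
        chosen = preds.gather(1, idx)                      # (B, k, C)
        return (weights.unsqueeze(-1) * chosen).sum(dim=1) # (B, C)

# Toy usage with two hypothetical backbones:
backbones = [nn.Sequential(nn.Flatten(), nn.Linear(32, 16)) for _ in range(2)]
heads = [nn.Linear(16, 10) for _ in range(2)]
gate = nn.Sequential(nn.Flatten(), nn.Linear(32, 2))
net = HubPathway(backbones, heads, gate, k=1)
logits = net(torch.randn(4, 32))  # -> shape (4, 10)
```
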
Open Source Code: Yes
"3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]"

Open Datasets: Yes
"We follow this setting and use 5 ResNet-50 models pre-trained on 5 widely-used computer vision datasets: (1) supervised pre-trained model and (2) unsupervised pre-trained model with MoCo [26] on ImageNet [55], (3) Mask R-CNN [28] model for detection and instance segmentation, (4) DeepLabV3 [8] model for semantic segmentation, and (5) Keypoint R-CNN model for keypoint detection, pre-trained on the COCO-2017 challenge datasets of each task. We verify the efficacy of Hub-Pathway on 7 classification tasks evaluated in [59], including: (1) general classification benchmarks CIFAR-100 [39] and COCO-70 [70]; (2) fine-grained benchmarks for aircraft (FGVC Aircraft [45]), car (Stanford Cars [38]), and indoor scene (MIT-Indoors [51]) classification; (3) specialized benchmarks collected from the DeepMind Lab environment (DMLab [4]) and the Sentinel-2 satellite (EuroSAT [30]). We use the same model hub as the image classification setting in Section 4.1 and transfer to three facial landmark detection tasks: 300W [57], WFLW [66], and COFW [7]. We use the Seaquest and Riverraid tasks as source tasks and construct the model hub consisting of optimal policies for the source tasks. We then transfer to 3 downstream tasks: Alien, Gopher, and James Bond."

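For reference, the five-model ResNet-50 hub described above can be assembled largely from torchvision. The sketch below assumes torchvision >= 0.13 (for the `weights=` API); the MoCo checkpoint is not distributed with torchvision, so its path is a placeholder for the file released by the official facebookresearch/moco repository.

```python
import torch
from torchvision import models
from torchvision.models import detection, segmentation

def build_model_hub(moco_ckpt_path="moco_v2_800ep_pretrain.pth.tar"):
    """Assemble the five ResNet-50 models described in the paper (sketch only)."""
    hub = {}

    # (1) Supervised ImageNet classification model.
    hub["supervised"] = models.resnet50(weights="DEFAULT")

    # (2) Unsupervised MoCo model: same architecture, weights loaded from the
    # released checkpoint (query-encoder keys stripped of their prefix).
    moco = models.resnet50(weights=None)
    state = torch.load(moco_ckpt_path, map_location="cpu")["state_dict"]
    state = {k.replace("module.encoder_q.", ""): v for k, v in state.items()
             if k.startswith("module.encoder_q.")
             and not k.startswith("module.encoder_q.fc")}
    moco.load_state_dict(state, strict=False)
    hub["moco"] = moco

    # (3) Mask R-CNN for detection and instance segmentation (COCO-2017).
    hub["maskrcnn"] = detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

    # (4) DeepLabV3 for semantic segmentation.
    hub["deeplabv3"] = segmentation.deeplabv3_resnet50(weights="DEFAULT")

    # (5) Keypoint R-CNN for keypoint detection (COCO-2017).
    hub["keypointrcnn"] = detection.keypointrcnn_resnet50_fpn(weights="DEFAULT")

    return hub
```
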
Dataset Splits: Yes
"For all the datasets, we follow the same dataset split as in [59]. For the facial landmark detection experiments, we generally follow the training and testing protocols in [60] and the standard training scheme in [66]."

Hardware Specification: No
"The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed machine specifications) used for running its experiments."

Software Dependencies: No
"The paper mentions software components such as the SGD optimizer, the Adam optimizer, and ResNet, but does not provide version numbers for these or any other software dependencies."

Experiment Setup: Yes
"We adopt the SGD optimizer with an initial learning rate of 0.01 and momentum of 0.9. The models are trained for 15k iterations with a batch size of 48. The learning rate is decayed by a rate of 0.1 at the 6k-th and 12k-th iterations. [...] The models are trained for 60 epochs with a batch size of 16 using the Adam optimizer. The learning rate is set to 0.0001 initially and is decayed by a rate of 0.1 at the 30th and 50th epochs."

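The two quoted recipes appear to describe different experiment settings (the SGD schedule matches the classification setup, the Adam schedule the facial landmark one). A minimal PyTorch sketch of both follows, with a toy model and synthetic data standing in for the real networks and loaders:

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 10)          # placeholder for the fine-tuned network
criterion = nn.CrossEntropyLoss()

# Schedule (a): SGD, lr 0.01, momentum 0.9, 15k iterations, batch size 48,
# lr decayed by 0.1 at the 6k-th and 12k-th iterations.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[6000, 12000], gamma=0.1)

for step in range(15000):
    x, y = torch.randn(48, 128), torch.randint(0, 10, (48,))  # stand-in batch
    optimizer.zero_grad()
    criterion(model(x), y).backward()
    optimizer.step()
    scheduler.step()                # milestones counted in iterations

# Schedule (b): Adam, lr 1e-4, 60 epochs, batch size 16,
# lr decayed by 0.1 at epochs 30 and 50 (step the scheduler once per epoch).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[30, 50], gamma=0.1)
```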