Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models
Authors: Yang Shu, Zhangjie Cao, Ziyang Zhang, Jianmin Wang, Mingsheng Long
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results on computer vision and reinforcement learning tasks demonstrate that the proposed Hub-Pathway framework achieves the state-of-the-art performance for model hub transfer learning. |
| Researcher Affiliation | Collaboration | Yang Shu, Zhangjie Cao, Ziyang Zhang, Jianmin Wang, Mingsheng Long. School of Software, BNRist, Tsinghua University, China; Advanced Computing and Storage Lab, Huawei Technologies Co. Ltd. {shuyang5656,caozhangjie14}@gmail.com, {jimwang,mingsheng}@tsinghua.edu.cn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | 3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | We follow this setting and use 5 ResNet-50 models pre-trained on 5 widely-used computer vision datasets: (1) Supervised pre-trained model and (2) Unsupervised pre-trained model with MOCO [26] on ImageNet [55], (3) Mask R-CNN [28] model for detection and instance segmentation, (4) DeepLabV3 [8] model for semantic segmentation, and (5) Keypoint R-CNN model for keypoint detection, pre-trained on COCO-2017 challenge datasets of each task. We verify the efficacy of Hub-Pathway on 7 various classification tasks evaluated in [59], including: (1) General classification benchmarks with CIFAR-100 [39] and COCO-70 [70]; (2) Fine-grained benchmarks for aircraft (FGVC Aircraft [45]), car (Stanford Cars [38]) and indoor scene (MIT-Indoors [51]) classification; (3) Specialized benchmarks collected from the DeepMind Lab environment (DMLab [4]) and Sentinel-2 satellite (EuroSAT [30]). We use the same model hub as the image classification setting in Section 4.1 and transfer to three facial landmark detection tasks: 300W [57], WFLW [66], and COFW [7]. We use the Seaquest and Riverraid tasks as source tasks and construct the model hub consisting of optimal policies for the source tasks. We then transfer to 3 downstream tasks: Alien, Gopher, and James Bond. |
| Dataset Splits | Yes | For all the datasets, we follow the same dataset split as in [59]. For the facial landmark detection experiments, we generally follow the training and testing protocols in [60] and the standard training scheme in [66]. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like SGD optimizer, Adam optimizer, ResNet, etc., but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We adopt the SGD optimizer with an initial learning rate of 0.01 and momentum of 0.9. The models are trained for 15k iterations with a batch size of 48. The learning rate is decayed by a rate of 0.1 at the 6k-th and 12k-th iterations. [...] The models are trained for 60 epochs with a batch size of 16 using the Adam optimizer. The learning rate is set as 0.0001 initially and is decayed by a rate of 0.1 at the 30-th and 50-th epochs. |
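The two quoted setups both use a step-decay learning-rate schedule (decay by 0.1 at fixed milestones). A minimal sketch of such a schedule, with the milestones and base rates taken from the quotes above (the function name and its interface are illustrative assumptions, not from the paper):

```python
def step_decay_lr(step, base_lr, milestones, gamma=0.1):
    """Return the learning rate at a given step (iteration or epoch).

    The rate starts at `base_lr` and is multiplied by `gamma` each time
    a milestone is reached, matching the schedules quoted in the table.
    """
    lr = base_lr
    for m in milestones:
        if step >= m:
            lr *= gamma
    return lr

# Classification setup: SGD, lr 0.01, decayed at the 6k-th and 12k-th of 15k iterations.
print(step_decay_lr(0, 0.01, (6000, 12000)))      # base rate before any decay
print(step_decay_lr(6000, 0.01, (6000, 12000)))   # after the first decay
print(step_decay_lr(12000, 0.01, (6000, 12000)))  # after the second decay

# Second setup: Adam, lr 0.0001, decayed at the 30-th and 50-th of 60 epochs.
print(step_decay_lr(29, 0.0001, (30, 50)))
print(step_decay_lr(55, 0.0001, (30, 50)))
```

In a framework such as PyTorch this corresponds to pairing the optimizer with a multi-step scheduler (e.g. `MultiStepLR` with the same milestones and `gamma=0.1`).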