HRFormer: High-Resolution Vision Transformer for Dense Prediction

Authors: Yuhui Yuan, Rao Fu, Lang Huang, Weihong Lin, Chao Zhang, Xilin Chen, Jingdong Wang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on image classification, pose estimation, and semantic segmentation tasks, and achieve competitive performance on various benchmarks. For example, HRFormer-B gains +1.0% top-1 accuracy on ImageNet classification over DeiT-B [42] with 40% fewer parameters and 20% fewer FLOPs. HRFormer-B gains 0.9% AP over HRNet-W48 [41] on COCO val set with 32% fewer parameters and 19% fewer FLOPs. HRFormer-B + OCR gains +1.2% and +2.0% mIoU over HRNet-W48 + OCR [55] with 25% fewer parameters and slightly more FLOPs on PASCAL-Context test and COCO-Stuff test, respectively.
Researcher Affiliation | Collaboration | 1 University of Chinese Academy of Sciences; 2 Institute of Computing Technology, CAS; 3 Peking University; 4 Microsoft Research Asia; 5 Baidu
Pseudocode | No | The paper includes diagrams illustrating the HRFormer block (Figure 1) and architecture (Figure 2), but it does not provide any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at: https://github.com/HRNet/HRFormer.
Open Datasets | Yes | We train our model on the COCO train2017 dataset, including 57K images and 150K person instances. We evaluate our approach on the val2017 set and test-dev2017 set, containing 5K images and 20K images, respectively.
Dataset Splits | Yes | We train our model on the COCO train2017 dataset, including 57K images and 150K person instances. We evaluate our approach on the val2017 set and test-dev2017 set, containing 5K images and 20K images, respectively.
Hardware Specification | Yes | Each HRFormer experiment on the COCO pose estimation task takes 8 32G-V100 GPUs. Each HRFormer + OCR experiment on Cityscapes takes 8 32G-V100 GPUs. HRFormer-T and HRFormer-S require 8 32G-V100 GPUs, and HRFormer-B requires 32 32G-V100 GPUs.
Software Dependencies | No | The paper mentions 'mmpose [8]' and 'AdamW' as part of the training settings, but does not provide specific version numbers for these or any other software libraries or frameworks (e.g., PyTorch, TensorFlow, Python version) used for the experiments.
Experiment Setup | Yes | We set the initial learning rate as 0.0001, weight decay as 0.01, crop size as 1024 × 512, batch size as 8, and training iterations as 80K by default. (See the sketch after the table for how these settings fit together.)
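
The last row's training recipe maps onto a standard PyTorch loop. Below is a minimal sketch under stated assumptions: the model is a 1x1-conv stand-in rather than the released HRFormer, and the polynomial learning-rate decay and the 19-class label space are assumptions, not settings quoted in the table. Only AdamW and the hyper-parameters from the rows above (LR 0.0001, weight decay 0.01, batch size 8, crop 1024 × 512, 80K iterations) come from the report.

```python
# Minimal sketch of the quoted training recipe, assuming a PyTorch setup.
# The model is a placeholder, NOT the released HRFormer; the poly LR decay
# and the 19-class label space are assumptions, not quoted settings.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Conv2d(3, 19, kernel_size=1)  # placeholder segmentation head

base_lr = 1e-4          # "initial learning rate as 0.0001"
weight_decay = 0.01     # "weight decay as 0.01"
batch_size = 8          # "batch size as 8"
total_iters = 80_000    # "training iterations as 80K"

optimizer = AdamW(model.parameters(), lr=base_lr, weight_decay=weight_decay)
# Polynomial decay (power 0.9) is assumed; the table does not state the schedule.
scheduler = LambdaLR(optimizer, lr_lambda=lambda it: (1.0 - it / total_iters) ** 0.9)

for it in range(total_iters):
    # Dummy batch of 1024 x 512 crops ("crop size as 1024 × 512").
    images = torch.randn(batch_size, 3, 512, 1024)
    targets = torch.randint(0, 19, (batch_size, 512, 1024))
    loss = torch.nn.functional.cross_entropy(model(images), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
    break  # one illustrative step; a real run iterates to total_iters
```

The released repository and the mmpose-based pipeline mentioned above remain the authoritative references; this sketch only shows how the quoted hyper-parameters combine in a single optimizer and schedule.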