Context-Guided Adaptive Network for Efficient Human Pose Estimation

Authors: Lei Zhao, Jun Wen, Pengfei Wang, Nenggan Zheng (pp. 3492–3499)

AAAI 2021

Reproducibility Variable Result LLM Response
Research Type Experimental Experimenting on the COCO dataset, our method achieves 68.1 AP at 25.4 fps, and outperforms Mask R-CNN by 8.9 AP at a similar speed. The competitive performance on the HPE and person instance segmentation tasks over the state-of-the-art models shows the promise of the proposed method.
Researcher Affiliation Academia 1 Qiushi Academy for Advanced Studies, Zhejiang University, Hangzhou, China 2 College of Computer Science and Technology, Zhejiang University, Hangzhou, China 3 Collaborative Innovation Center for Artificial Intelligence by MOE and Zhejiang Provincial Government (ZJU) 4 Zhejiang Lab, Hangzhou, China
Pseudocode Yes Algorithm 1 Dichotomy Extended Area (one box, upper boundary).
Open Source Code Yes The source code will be made available at https://github.com/zlcnup/CGANet.
Open Datasets Yes In this section, we evaluate our approach on the COCO dataset (Lin et al. 2014), which contains over 200,000 images and 250,000 person instances labeled with 17 keypoints.
Dataset Splits Yes It is divided into train2017/val2017/test-dev2017 sets with 57k, 5k and 20k images respectively.
Hardware Specification Yes We report the inference time (speed) of models with a batch size of one in the same environment, equipped with a single NVIDIA GTX 2080Ti GPU.
Software Dependencies Yes CUDA V10.0 and PyTorch 1.4
Experiment Setup Yes We use data augmentation with random scale between 0.6 and 1.5, random rotation between -45° and +45°, random translation between -40 and +40 pixels, and random flip to crop an input image patch. The aligned feature sizes are 1/16, 1/32 and 1/64 of the training input size, respectively. We use the SGD optimizer for 95 epochs, with an initial learning rate of 1e-2 (dropped to 1e-3 and 1e-4 at the 70th and 85th epochs, respectively).
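The augmentation ranges and step learning-rate schedule quoted above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the uniform sampling distributions and the 50% flip probability are assumptions (the quote only gives the ranges), and `learning_rate` mirrors what PyTorch's `MultiStepLR` with milestones [70, 85] and gamma 0.1 would produce.

```python
import random

def sample_augmentation(rng=random):
    """Sample augmentation parameters for one training patch.

    Ranges follow the reported setup; uniform distributions and a
    50% flip chance are assumptions made for this sketch.
    """
    return {
        "scale": rng.uniform(0.6, 1.5),                # random scale in [0.6, 1.5]
        "rotation_deg": rng.uniform(-45.0, 45.0),      # random rotation in [-45°, +45°]
        "translation_px": (rng.uniform(-40.0, 40.0),   # random translation in
                           rng.uniform(-40.0, 40.0)),  # [-40, +40] pixels per axis
        "flip": rng.random() < 0.5,                    # random horizontal flip
    }

def learning_rate(epoch):
    """Step LR schedule over 95 epochs: 1e-2 initially,
    dropped to 1e-3 at epoch 70 and to 1e-4 at epoch 85."""
    if epoch < 70:
        return 1e-2
    if epoch < 85:
        return 1e-3
    return 1e-4
```

In a training loop, `learning_rate(epoch)` would be assigned to the SGD optimizer's parameter groups at the start of each epoch.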