Robust Pose Estimation in Crowded Scenes with Direct Pose-Level Inference
Authors: Dongkai Wang, Shiliang Zhang, Gang Hua
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on several crowded-scene pose estimation benchmarks demonstrate the superiority of PINet. For instance, it achieves 59.8% AP on the OCHuman dataset, outperforming recent works by a large margin. |
| Researcher Affiliation | Collaboration | Dongkai Wang (Peking University, dongkai.wang@pku.edu.cn); Shiliang Zhang (Peking University, slzhang.jdl@pku.edu.cn); Gang Hua (Wormpex AI Research, ganghua@gmail.com) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at: https://github.com/kennethwdk/PINet |
| Open Datasets | Yes | OCHuman [31] is a recently proposed benchmark... CrowdPose [10] is another dataset... COCO [12] is the popular MPPE benchmark... |
| Dataset Splits | Yes | OCHuman... including 2500 images for validation and 2231 images for testing. Following [22], we train models on val set and report the performance on test set. CrowdPose... It contains 10K, 2K and 8K images for train, val and test set. COCO... The train set includes 57K images and 150K person instances... the val set contains 5K images, and the test-dev set consists of 20K images. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch [19]' but does not provide a specific version number for it or any other key software dependencies. |
| Experiment Setup | Yes | The input image is resized to 512×512. We use Adam [8] to optimize the model; the learning rate is set to 0.001 for all layers. We train the model for 140 epochs on OCHuman and COCO, dividing the learning rate by 10 at the 90th and 120th epochs. For CrowdPose, we train the model for 300 epochs and divide the learning rate by 10 at the 200th and 260th epochs. The batch size is set to 20 for OCHuman and 40 for CrowdPose and COCO. We adopt data augmentation strategies including random rotation (-30, 30), scale ([0.75, 1.5]), translation ([-40, 40]) and flipping (0.5). |
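
The step-decay schedule quoted in the Experiment Setup row can be sketched as follows. This is an illustrative reconstruction from the reported hyperparameters, not the authors' released code (see the PINet repository for the actual implementation); the function and parameter names are our own.

```python
# Step-decay learning-rate schedule as described in the paper:
# start at 0.001 and divide by 10 at each milestone epoch.
def learning_rate(epoch, base_lr=0.001, milestones=(90, 120), gamma=0.1):
    """Return the learning rate in effect at a given (0-indexed) epoch."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma  # one decay step per milestone passed
    return lr

# OCHuman / COCO: 140 epochs, decay at epochs 90 and 120.
# CrowdPose: 300 epochs, decay at epochs 200 and 260.
def crowdpose_lr(epoch):
    return learning_rate(epoch, milestones=(200, 260))

# Reported augmentation ranges, collected for reference (names assumed).
AUGMENTATION = {
    "rotation_deg": (-30, 30),
    "scale": (0.75, 1.5),
    "translation_px": (-40, 40),
    "flip_prob": 0.5,
}
```

In a PyTorch training loop this corresponds to `torch.optim.Adam` combined with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[90, 120], gamma=0.1)`.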