Progressive Cognitive Human Parsing

Authors: Bingke Zhu, Yingying Chen, Ming Tang, Jinqiao Wang

AAAI 2018

Reproducibility Variable — Result — LLM Response

Research Type — Experimental
    The experiments indicate that our method has a better localization capacity for small objects and a better classification capacity for large objects. Moreover, our framework can be embedded into any fully convolutional network to enhance performance significantly.

Researcher Affiliation — Academia
    National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China

Pseudocode — No
    The paper describes the architecture and processes in detail, but it does not include any structured pseudocode or algorithm blocks.

Open Source Code — No
    The paper does not provide an explicit statement about releasing the source code or a link to a code repository.

Open Datasets — Yes
    We evaluate our algorithm on the public human parsing dataset, PASCAL-Person-Part (Chen et al. 2014).

Dataset Splits — No
    The paper states "We only use the images containing human for training (1716 images) and validation (1817 images)" but does not explicitly specify a distinct test set size or how the overall dataset is split into training, validation, and test subsets.

Hardware Specification — Yes
    All of our experiments are implemented on a system with a Core E5-2660 CPU @ 2.60 GHz and four NVIDIA GeForce GTX TITAN X GPUs with 12 GB of memory each.

Software Dependencies — No
    The paper mentions using the "Caffe platform" but does not specify its version or the versions of any other ancillary software components such as Open MPI.

Experiment Setup — Yes
    We utilize the stochastic gradient descent (SGD) solver with batch size 8, momentum 0.9, and weight decay 0.0005. Inspired by the semantic segmentation optimization (Chen et al. 2017; Zhao et al. 2017), we use the poly learning rate policy, lr = base_lr * (1 - iter/max_iter)^power, setting the base learning rate to 0.001 and the power to 0.9. Input images are resized to 473 x 473. For data augmentation, we add random Gaussian blur to the images and rotate them by random angles from -20 to 20 degrees.
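The poly learning-rate policy quoted above can be sketched as a small function; this is an illustrative standalone form (the paper implements the schedule inside the Caffe solver, and the function name here is our own):

```python
def poly_lr(iteration, max_iter, base_lr=0.001, power=0.9):
    """Poly policy: lr = base_lr * (1 - iter/max_iter)^power.

    With the paper's settings (base_lr=0.001, power=0.9), the learning
    rate starts at 0.001 and decays smoothly to 0 at max_iter.
    """
    return base_lr * (1.0 - iteration / max_iter) ** power
```

For example, halfway through training the rate has dropped to roughly base_lr * 0.5^0.9, a little over half the initial value; the power of 0.9 makes the decay nearly linear but slightly slower early on.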