Pose-Guided Human Parsing by an AND/OR Graph Using Pose-Context Features
Authors: Fangting Xia, Jun Zhu, Peng Wang, Alan Yuille
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on the popular Penn-Fudan pedestrian parsing dataset, showing that it significantly outperforms state-of-the-art methods, and perform diagnostics to demonstrate the effectiveness of different stages of our pipeline. |
| Researcher Affiliation | Academia | Fangting Xia, Jun Zhu, Peng Wang, and Alan L. Yuille; Department of Statistics, University of California, Los Angeles, Los Angeles, California 90095 |
| Pseudocode | No | The paper describes algorithms using text and diagrams (Fig. 6 and Fig. 7) but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not state that its code is open-source or provide a link to a code repository for the methodology described. |
| Open Datasets | Yes | We evaluate our algorithm on the Penn-Fudan benchmark (Wang et al. 2007), which consists of pedestrians in outdoor scenes with much pose variation. Because this dataset only provides testing data, following previous works (Bo and Fowlkes 2011; Rauschert and Collins 2012; Luo, Wang, and Tang 2013), we train our parsing models using the Human Eva dataset (Sigal and Black 2006), which contains 937 images with pixel-level label maps for parts annotated by Bo and Fowlkes. The labels of the two datasets are consistent, comprising 7 body parts: {hair, face, upper-clothes, lower-clothes, arms (arm skin), legs (leg skin), and shoes}. For the pose model, we use the model provided by Chen and Yuille, trained on the Leeds Sports Pose Dataset (Johnson and Everingham 2010). |
| Dataset Splits | No | The paper mentions using Human Eva for training and Penn-Fudan for testing, but it does not specify explicit train/validation/test splits, percentages, or cross-validation details for its model training. |
| Hardware Specification | No | The paper states: 'We also thank NVIDIA for providing us with free GPUs that are used to train deep models.' This mentions GPUs but does not specify exact models (e.g., 'NVIDIA A100', 'RTX 3090'), which is not specific enough to count as a hardware specification. |
| Software Dependencies | No | The paper mentions using a FCN16s deep network and SVR models, but it does not specify version numbers for these or any other software libraries or dependencies used (e.g., PyTorch, TensorFlow, scikit-learn). |
| Experiment Setup | Yes | In part ranking & selection, we train linear SVR models for P = 10 part categories and select the top n_p = 10 segments for each part category as candidates for the final assembling stage (see the sketch below the table). We treat the left part and right part as two different part categories. For the segment feature used in the AOG (i.e., the unary term), we first normalize each kind of feature independently, then concatenate them together and normalize the whole feature. All the normalization is done with the L2 norm. For simplicity, we only train one SVR model g_p(s_i \| L, H) for each part category p, so that g_p^{z_p} = g_p for all z_p ≥ 0 in Eq. (6). We set k = 10, making the inference procedure tractable with a moderate number of state configurations for each vertex. |
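
The Experiment Setup row describes a concrete procedure: L2-normalize each feature type independently, concatenate, L2-normalize the whole vector, then score candidate segments with a per-part linear SVR and keep the top n_p = 10. The following is a minimal sketch of that pipeline, not the authors' implementation: it assumes scikit-learn's `LinearSVR` as a stand-in for the paper's linear SVR models, and the feature dimensions, candidate counts, training targets, and helper names (`build_segment_feature`, `rank_and_select`) are hypothetical placeholders rather than values from the paper.

```python
import numpy as np
from sklearn.svm import LinearSVR

P = 10    # number of part categories (left/right parts counted separately)
N_P = 10  # top-n_p segments kept per part category

def build_segment_feature(feature_groups):
    """L2-normalize each feature type independently, concatenate them,
    then L2-normalize the concatenated vector, as described above."""
    normed = []
    for f in feature_groups:
        f = np.asarray(f, dtype=np.float64)
        n = np.linalg.norm(f)
        normed.append(f / n if n > 0 else f)
    whole = np.concatenate(normed)
    n = np.linalg.norm(whole)
    return whole / n if n > 0 else whole

def rank_and_select(model, candidate_features, top_k=N_P):
    """Score candidate segments with a part's SVR model and keep the top-k."""
    scores = model.predict(candidate_features)
    keep = np.argsort(scores)[::-1][:top_k]
    return keep, scores[keep]

# Hypothetical training loop: one linear SVR per part category, regressing a
# segment-quality target (e.g., overlap with ground truth) on synthetic data.
rng = np.random.default_rng(0)
svr_models = []
for p in range(P):
    X = rng.normal(size=(200, 64))  # placeholder segment features (64-dim assumed)
    y = rng.uniform(size=200)       # placeholder quality targets
    svr_models.append(LinearSVR(C=1.0, max_iter=10000).fit(X, y))

# Usage: build features for 50 hypothetical candidates from two feature types,
# then rank them for part category 0.
candidates = np.stack([
    build_segment_feature([rng.normal(size=32), rng.normal(size=32)])
    for _ in range(50)
])
idx, top_scores = rank_and_select(svr_models[0], candidates)
```

Training one SVR per part category (rather than one per state z_p) mirrors the simplification quoted in the row above, which is why a single `svr_models[p]` suffices for every state configuration of part p.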