Articulated Pose Estimation by a Graphical Model with Image Dependent Pairwise Relations
Authors: Xianjie Chen, Alan L. Yuille
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments on two standard pose estimation benchmarks: the LSP dataset [10] and the FLIC dataset [20]. Our method outperforms the state-of-the-art methods by a significant margin on both datasets. We also do cross-dataset evaluation on the Buffy dataset [7] (without training on this dataset) and obtain strong results which show the ability of our model to generalize. |
| Researcher Affiliation | Academia | Xianjie Chen University of California, Los Angeles Los Angeles, CA 90024 cxj@ucla.edu; Alan Yuille University of California, Los Angeles Los Angeles, CA 90024 yuille@stat.ucla.edu |
| Pseudocode | No | The paper describes algorithms and inference steps in text and equations but does not provide a formal pseudocode block or algorithm box. |
| Open Source Code | No | The paper mentions using Caffe [9] which is an open source framework, but does not state that the code for their specific methodology is open source or provide a link to it. |
| Open Datasets | Yes | We perform our experiments on two publicly available human pose estimation benchmarks: (i) the Leeds Sports Poses (LSP) dataset [10], that contains 1000 training and 1000 testing images...; (ii) the Frames Labeled In Cinema (FLIC) dataset [20] that contains 3987 training and 1016 testing images... To train our models, we also use the negative training images from the INRIA Person dataset [3]... |
| Dataset Splits | Yes | We hold out random positive images as a validation set for the DCNN training. Also the weight parameters w are trained on this held out set to reduce overfitting to training data. |
| Hardware Specification | Yes | The GPUs used in this research were generously donated by the NVIDIA Corporation. |
| Software Dependencies | No | The paper states 'We use the Caffe [9] implementation of DCNN in our experiments.' without providing a specific version number for Caffe or any other software dependencies. |
| Experiment Setup | Yes | The DCNN consists of five convolutional layers, 2 max-pooling layers and three fully-connected layers with a final |S| dimensions softmax output... We use dropout, local response normalization (norm) and overlapping pooling (pool) described in [12]. The size of input patch is 36 × 36 pixels on the LSP dataset, and 54 × 54 pixels on the FLIC dataset. |
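
The sketch below illustrates the architecture summarized in the Experiment Setup row: five convolutional layers, two max-pooling layers with overlapping pooling, local response normalization, three fully-connected layers with dropout, and a final |S|-way softmax over 36 × 36 LSP input patches. The paper used Caffe [9]; this is a PyTorch approximation, and the channel widths, kernel sizes, and fully-connected sizes are assumptions (AlexNet-style, following the reference to [12]), not values taken from the paper.

```python
# Hedged sketch of the Experiment Setup DCNN. Layer widths and kernel sizes
# are illustrative assumptions; only the layer counts, pooling/norm/dropout
# choices, softmax output, and 36x36 patch size come from the excerpt above.
import torch
import torch.nn as nn


class PartPresenceDCNN(nn.Module):
    def __init__(self, num_classes: int, in_channels: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            # conv1 + local response norm + overlapping pooling (kernel 3, stride 2)
            nn.Conv2d(in_channels, 64, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0),
            nn.MaxPool2d(kernel_size=3, stride=2),
            # conv2 + local response norm + overlapping pooling
            nn.Conv2d(64, 128, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0),
            nn.MaxPool2d(kernel_size=3, stride=2),
            # conv3-conv5, no further pooling
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Three fully-connected layers with dropout; the final layer has |S| outputs.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, 4096),  # 8x8 feature map for a 36x36 input patch
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Log-probabilities over the |S| softmax classes.
        return torch.log_softmax(self.classifier(self.features(x)), dim=1)


if __name__ == "__main__":
    # |S| depends on the number of parts and pairwise relation types, which the
    # excerpt does not specify; 100 is a placeholder for this sketch.
    net = PartPresenceDCNN(num_classes=100)
    patches = torch.randn(4, 3, 36, 36)  # batch of 36x36 LSP-style patches
    print(net(patches).shape)  # -> torch.Size([4, 100])
```

For the FLIC setting described in the same row, the input patch would be 54 × 54 pixels, which changes the flattened feature size feeding the first fully-connected layer.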