Articulated Pose Estimation by a Graphical Model with Image Dependent Pairwise Relations

Authors: Xianjie Chen, Alan L. Yuille

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform experiments on two standard pose estimation benchmarks: LSP dataset [10] and FLIC dataset [20]. Our method outperforms the state of the art methods by a significant margin on both datasets. We also do cross-dataset evaluation on Buffy dataset [7] (without training on this dataset) and obtain strong results which shows the ability of our model to generalize.
Researcher Affiliation | Academia | Xianjie Chen, University of California, Los Angeles, Los Angeles, CA 90024, cxj@ucla.edu; Alan Yuille, University of California, Los Angeles, Los Angeles, CA 90024, yuille@stat.ucla.edu
Pseudocode | No | The paper describes algorithms and inference steps in text and equations but does not provide a formal pseudocode block or algorithm box.
Open Source Code | No | The paper mentions using Caffe [9], which is an open source framework, but does not state that the code for its specific methodology is open source or provide a link to it.
Open Datasets | Yes | We perform our experiments on two publicly available human pose estimation benchmarks: (i) the Leeds Sports Poses (LSP) dataset [10], that contains 1000 training and 1000 testing images...; (ii) the Frames Labeled In Cinema (FLIC) dataset [20] that contains 3987 training and 1016 testing images... To train our models, we also use the negative training images from the INRIAPerson dataset [3]...
Dataset Splits | Yes | We hold out random positive images as a validation set for the DCNN training. Also the weight parameters w are trained on this held out set to reduce overfitting to training data.
Hardware Specification | Yes | The GPUs used in this research were generously donated by the NVIDIA Corporation.
Software Dependencies | No | The paper states 'We use the Caffe [9] implementation of DCNN in our experiments.' without providing a specific version number for Caffe or any other software dependencies.
Experiment Setup | Yes | The DCNN consists of five convolutional layers, 2 max-pooling layers and three fully-connected layers with a final |S| dimensions softmax output... We use dropout, local response normalization (norm) and overlapping pooling (pool) described in [12]. The size of input patch is 36 × 36 pixels on the LSP dataset, and 54 × 54 pixels on the FLIC dataset.
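
The Dataset Splits entry says only that random positive images are held out as a validation set; the quoted text gives no count and no random seed. The following is a minimal Python sketch of such a hold-out, not the authors' procedure. The function name `hold_out_validation`, the image identifiers, the held-out count of 100, and the seed are all assumptions made for illustration; only the figure of 1000 LSP training images comes from the table above.

```python
import random


def hold_out_validation(image_ids, n_val, seed=0):
    """Randomly split image_ids into (train_ids, val_ids), holding out n_val items.

    Hypothetical helper: the paper does not specify how many images are held out.
    """
    if not 0 <= n_val <= len(image_ids):
        raise ValueError("n_val must be between 0 and len(image_ids)")
    rng = random.Random(seed)   # local RNG so the split is repeatable
    shuffled = list(image_ids)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled[n_val:], shuffled[:n_val]


# 1000 LSP training images (from the table); the IDs and n_val=100 are assumed.
lsp_ids = [f"lsp_{i:04d}" for i in range(1000)]
train_ids, val_ids = hold_out_validation(lsp_ids, n_val=100)
```

Fixing the seed makes the split repeatable, which is the detail a reader would need in order to reproduce both the DCNN validation and the training of the weight parameters w on the same held-out images.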