Associative Embedding: End-to-End Learning for Joint Detection and Grouping
Authors: Alejandro Newell, Zhiao Huang, Jia Deng
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show how to apply this method to multi-person pose estimation and report state-of-the-art performance on the MPII and MS-COCO datasets. |
| Researcher Affiliation | Academia | Alejandro Newell Computer Science and Engineering University of Michigan Ann Arbor, MI alnewell@umich.edu Zhiao Huang* Institute for Interdisciplinary Information Sciences Tsinghua University Beijing, China hza14@mails.tsinghua.edu.cn Jia Deng Computer Science and Engineering University of Michigan Ann Arbor, MI jiadeng@umich.edu |
| Pseudocode | No | The paper describes the method in prose and uses diagrams but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using TensorFlow but does not state that its own source code is open or provide a link to its implementation. |
| Open Datasets | Yes | We evaluate on two datasets: MS-COCO [27] and MPII Human Pose [3]. MPII Human Pose consists of about 25k images and contains around 40k total annotated people (three-quarters of which are available for training). MS-COCO [27] consists of around 60K training images with more than 100K people with annotated keypoints. |
| Dataset Splits | Yes | MPII Human Pose consists of about 25k images and contains around 40k total annotated people (three-quarters of which are available for training). MS-COCO [27] consists of around 60K training images with more than 100K people with annotated keypoints. We report performance on two test sets, a development test set (test-dev) and a standard test set (test-std). |
| Hardware Specification | No | The paper mentions using TensorFlow but does not specify any CPU, GPU models, or other hardware specifications used for running the experiments. |
| Software Dependencies | No | We train the network using... Tensorflow [2]. No specific version number is provided for TensorFlow or other software dependencies. |
| Experiment Setup | Yes | The network used for this task consists of four stacked hourglass modules, with an input size of 512×512 and an output resolution of 128×128. We train the network using a batch size of 32 with a learning rate of 2e-4 (dropped to 1e-5 after about 150k iterations) using Tensorflow [2]. The associative embedding loss is weighted by a factor of 1e-3 relative to the MSE loss of the detection heatmaps. |
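The experiment-setup row can be summarized as a small training configuration. A minimal sketch follows: only the quoted values (batch size 32, learning rate 2e-4 dropped to 1e-5 after about 150k iterations, 1e-3 embedding-loss weight, 512×512 input, 128×128 output) come from the paper; the function names and the exact shape of the step schedule are assumptions for illustration, not the authors' code.

```python
# Hyperparameters quoted in the paper's experiment setup.
BATCH_SIZE = 32
BASE_LR = 2e-4
DROPPED_LR = 1e-5
LR_DROP_ITER = 150_000      # "dropped to 1e-5 after about 150k iterations"
EMBED_LOSS_WEIGHT = 1e-3    # associative embedding loss weight vs. detection MSE
INPUT_SIZE = (512, 512)     # network input resolution
OUTPUT_SIZE = (128, 128)    # heatmap output resolution

def learning_rate(iteration: int) -> float:
    """Step schedule implied by the paper (exact drop point is approximate)."""
    return BASE_LR if iteration < LR_DROP_ITER else DROPPED_LR

def total_loss(detection_mse: float, embedding_loss: float) -> float:
    """Combine the two losses with the 1e-3 weighting quoted in the paper."""
    return detection_mse + EMBED_LOSS_WEIGHT * embedding_loss
```

The network itself (four stacked hourglass modules) is omitted here; the sketch only captures the reported optimization settings.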