Pixels to Graphs by Associative Embedding

Authors: Alejandro Newell, Jia Deng

NeurIPS 2017

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We benchmark on the Visual Genome dataset, and demonstrate state-of-the-art performance on the challenging task of scene graph generation." |
| Researcher Affiliation | Academia | Alejandro Newell, Jia Deng. Computer Science and Engineering, University of Michigan, Ann Arbor ({alnewell, jiadeng}@umich.edu) |
| Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper. |
| Open Source Code | No | No explicit statement about releasing source code, and no link to a code repository, was found. |
| Open Datasets | Yes | "We evaluate the performance of our method on the Visual Genome dataset [14]. Visual Genome consists of 108,077 images annotated with object detections and object-object relationships, and it serves as a challenging benchmark for scene graph generation on real world images." |
| Dataset Splits | No | The paper states, "We use the same categories, as well as the same training and test split as defined by the authors [26]", but does not provide specific percentages or counts for a validation split. |
| Hardware Specification | No | No specific hardware details (such as GPU models, CPU models, or memory specifications) used for running experiments were mentioned in the paper. |
| Software Dependencies | No | "We train a stacked hourglass architecture [21] in TensorFlow [1]." While TensorFlow is mentioned, no version number is given, and no other software dependencies are listed with versions. |
| Experiment Setup | Yes | "The input to the network is a 512x512 image, with an output resolution of 64x64. ... doubling the number of features to 512 at the two lowest resolutions of the hourglass. The output feature length f is 256. All losses (classification, bounding box regression, associative embedding) are weighted equally throughout the course of training. We set s_o = 3 and s_r = 6, which is sufficient to completely accommodate the detection annotations for all but a small fraction of cases." |
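As a rough illustration, the setup values quoted above can be collected into a single configuration sketch. This is not the authors' code; the field names below are our own invention, and only the numeric values are taken from the paper:

```python
# Hypothetical configuration sketch for the reported experiment setup.
# Field names are illustrative; values are quoted from the paper.
config = {
    "input_resolution": (512, 512),       # network input image size
    "output_resolution": (64, 64),        # heatmap/output grid size
    "max_hourglass_features": 512,        # features at the two lowest resolutions
    "output_feature_length": 256,         # f, the output feature length
    "object_slots": 3,                    # s_o: detections per output pixel
    "relation_slots": 6,                  # s_r: relationships per output pixel
    # All three losses are weighted equally throughout training.
    "loss_weights": {
        "classification": 1.0,
        "bbox_regression": 1.0,
        "associative_embedding": 1.0,
    },
}
```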