Iterative Scene Graph Generation
Authors: Siddhesh Khandelwal, Leonid Sigal
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments on Visual Genome [30] and Action Genome [25] benchmark datasets we show improved performance on the scene graph generation task. |
| Researcher Affiliation | Academia | Siddhesh Khandelwal (1,2) and Leonid Sigal (1,2,3). 1: Department of Computer Science, University of British Columbia; 2: Vector Institute for AI; 3: CIFAR AI Chair. {skhandel, lsigal}@cs.ubc.ca |
| Pseudocode | No | The paper describes the architecture and formulation but does not provide pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | The code is available at github.com/ubc-vision/IterativeSG. |
| Open Datasets | Yes | Through extensive experiments on Visual Genome [30] and Action Genome [25] benchmark datasets we show improved performance on the scene graph generation task. Visual Genome [30] is licensed under the Creative Commons Attribution 4.0 International License. Action Genome [25] is licensed under the MIT license. |
| Dataset Splits | No | We use widely used data splits for our experiments. We briefly describe this in Section 5. The hyperparameters and additional data details are also mentioned in the supplementary (Section B). |
| Hardware Specification | No | Hardware resources used in preparing this research were provided, in part, by the Province of Ontario, the Government of Canada through CIFAR, and companies sponsoring the Vector Institute. Additional support was provided by JELF CFI grant and Compute Canada under the RAC award. |
| Software Dependencies | No | We use ResNet-101 [22] as the backbone network for image feature extraction. |
| Experiment Setup | Yes | Implementation Details (transformer-based approach). We use ResNet-101 [22] as the backbone network for image feature extraction. Each of the subject, object, and predicate decoders has 6 layers, with a feature size of 256. The decoders use 300 queries. For training we use a batch size of 12 and an initial learning rate of 10^-4, which is gradually decayed. |
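
The quoted hyperparameters map onto a small stack of transformer decoders. Below is a minimal PyTorch sketch of that configuration; the module layout, attention-head count, conditioning order between decoders, optimizer, and decay schedule are assumptions for illustration, not the authors' released implementation (see the linked repository for that).

```python
# Minimal sketch of the decoder configuration quoted above. Feature size (256),
# number of layers (6), number of queries (300), batch size (12), and initial
# learning rate (1e-4) come from the paper; everything else is assumed.
import torch
import torch.nn as nn

D_MODEL = 256      # decoder feature size (from the paper)
NUM_LAYERS = 6     # layers per decoder (from the paper)
NUM_QUERIES = 300  # learned queries (from the paper)


def make_decoder() -> nn.TransformerDecoder:
    # nhead=8 is an assumption; the paper does not state it in this excerpt.
    layer = nn.TransformerDecoderLayer(d_model=D_MODEL, nhead=8, batch_first=True)
    return nn.TransformerDecoder(layer, num_layers=NUM_LAYERS)


class IterativeSGDecoders(nn.Module):
    """Subject, object, and predicate decoders over shared image features."""

    def __init__(self) -> None:
        super().__init__()
        self.queries = nn.Embedding(NUM_QUERIES, D_MODEL)  # learned query embeddings
        self.subject_decoder = make_decoder()
        self.object_decoder = make_decoder()
        self.predicate_decoder = make_decoder()

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        # image_features: (batch, num_tokens, D_MODEL), e.g. flattened
        # ResNet-101 backbone features projected to the decoder width.
        batch = image_features.size(0)
        q = self.queries.weight.unsqueeze(0).expand(batch, -1, -1)
        s = self.subject_decoder(q, image_features)
        o = self.object_decoder(s, image_features)       # chaining order is illustrative
        p = self.predicate_decoder(o, image_features)
        return p


model = IterativeSGDecoders()
# Batch size 12 and initial learning rate 1e-4 with gradual decay, per the paper;
# the AdamW choice and step schedule here are assumptions.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
```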