Mapping Images to Scene Graphs with Permutation-Invariant Structured Prediction
Authors: Roei Herzig, Moshiko Raboh, Gal Chechik, Jonathan Berant, Amir Globerson
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate our approach, we first demonstrate on a synthetic dataset that respecting permutation invariance is important, because models that violate this invariance need more training data, despite having a comparable model size. Then, we tackle the problem of scene graph generation. We describe a model that satisfies the permutation invariance property, and show that it achieves state-of-the-art results on the competitive Visual Genome benchmark [15], demonstrating the power of our new design principle. |
| Researcher Affiliation | Collaboration | Roei Herzig (Tel Aviv University, roeiherzig@mail.tau.ac.il); Moshiko Raboh (Tel Aviv University, mosheraboh@mail.tau.ac.il); Gal Chechik (Bar-Ilan University, NVIDIA Research, gal.chechik@biu.ac.il); Jonathan Berant (Tel Aviv University, AI2, joberant@cs.tau.ac.il); Amir Globerson (Tel Aviv University, gamir@post.tau.ac.il) |
| Pseudocode | No | The paper includes mathematical equations and a schematic diagram (Figure 2) to describe the architecture, but it does not provide any structured pseudocode or an algorithm block. |
| Open Source Code | Yes | The full code is available at https://github.com/shikorab/SceneGraph |
| Open Datasets | Yes | We evaluated our approach on Visual Genome (VG) [15], a dataset with 108,077 images annotated with bounding boxes, entities and relations. ... [15] Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David A. Shamma, et al. Visual genome: Connecting language and vision using crowdsourced dense image annotations. International Journal of Computer Vision, 123(1):32–73, 2017. |
| Dataset Splits | Yes | To tune hyper-parameters, we also split the training data into two by randomly selecting 5K examples, resulting in a final 70K/5K/32K split for train/validation/test sets. |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments, such as specific GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions training 'using Adam [14]' but does not provide specific software or library names with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other dependencies). |
| Experiment Setup | Yes | All networks were trained using Adam [14] with batch size 20. ... In the loss, we penalized entities 4 times more strongly than relations, and penalized negative relations 10 times more weakly than positive relations. ... The φ and α networks were each implemented as a single fully-connected (FC) layer with a 500-dimensional output. ρ was implemented as a FC network with three 500-dimensional hidden layers, with one 150-dimensional output for the entity probabilities and one 51-dimensional output for relation probabilities. |
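The layer sizes quoted in the Experiment Setup cell can be sketched as a minimal NumPy forward pass. This is an illustrative shape-check only, not the authors' implementation: the input feature dimension `FEAT_DIM`, the ReLU activations, the softmax heads, and the weight initialization are all assumptions not stated in the quoted text.

```python
import numpy as np

rng = np.random.default_rng(0)

def fc(in_dim, out_dim):
    """A single fully-connected layer, returned as a closure over (W, b)."""
    W = rng.standard_normal((in_dim, out_dim)) * 0.01
    b = np.zeros(out_dim)
    return lambda x: x @ W + b

def relu(x):
    return np.maximum(x, 0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

FEAT_DIM = 512  # hypothetical input feature size; not specified in the quote

# phi and alpha: each a single FC layer with a 500-dimensional output
phi = fc(FEAT_DIM, 500)
alpha = fc(FEAT_DIM, 500)

# rho: three 500-dimensional hidden layers, then two output heads,
# 150-dim for entity probabilities and 51-dim for relation probabilities
h1, h2, h3 = fc(500, 500), fc(500, 500), fc(500, 500)
entity_head = fc(500, 150)
relation_head = fc(500, 51)

def rho(z):
    h = relu(h3(relu(h2(relu(h1(z))))))
    return softmax(entity_head(h)), softmax(relation_head(h))

# Shape check on a batch of 4 hypothetical entity feature vectors
z = phi(rng.standard_normal((4, FEAT_DIM)))
ent_probs, rel_probs = rho(z)
print(ent_probs.shape, rel_probs.shape)  # (4, 150) (4, 51)
```

Running the sketch confirms the output dimensionalities match the quoted description: 150 entity classes and 51 relation classes per prediction.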