Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Multimodal Graph Networks for Compositional Generalization in Visual Question Answering
Authors: Raeid Saqur, Karthik Narasimhan
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate MGN on two tasks: a binary classification task of predicting if a caption matches an image based on attribute compositions in the CLEVR dataset [28], and CLOSURE [6], a recently released challenge for testing systematic generalization in language. |
| Researcher Affiliation | Collaboration | ¹University of Toronto, Computer Science; ²Princeton University, Computer Science; ³Vector Institute for Artificial Intelligence; raeidsaqur@cs.[toronto\|princeton].edu; Karthik Narasimhan, Department of Computer Science, Princeton University, EMAIL |
| Pseudocode | No | The paper describes processes and architectures but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/raeidsaqur/mgn |
| Open Datasets | Yes | We use images from the CLEVR dataset [28] and use their template generator to produce captions that are both true and false. The original dataset contains 1M questions generated from 100k images with 90 question template families... |
| Dataset Splits | Yes | All models were trained using Adam with a learning rate of $5 \times 10^{-4}$, a batch size of 64 for a maximum of 360k iterations, with early stopping based on validation accuracy. |
| Hardware Specification | No | No specific hardware (e.g., GPU/CPU models, memory details) used for running experiments was mentioned in the paper. |
| Software Dependencies | No | The paper mentions "PyTorch Geometric [13]" and the "en_core_web_sm LM", but does not provide specific version numbers for PyTorch, spaCy, or PyTorch Geometric itself. |
| Experiment Setup | Yes | All models were trained using Adam with a learning rate of $5 \times 10^{-4}$, a batch size of 64 for a maximum of 360k iterations, with early stopping based on validation accuracy. ... A learning rate of 0.01 with weight decay $5 \times 10^{-4}$ was used with the cross-entropy loss function. ... Both the encoder and decoder have hidden layers with a 256-dim hidden vector. We set the dimensions of both the encoder and decoder word vectors to be 300, and the multimodal graph vector representation to be 100. ... We use a learning rate of $1 \times 10^{-5}$ and a batch size of 64 for a maximum of 1,000,000 iterations. |
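
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. This is only an illustration in plain Python, not the authors' code: the labels `setup_1` to `setup_3`, all field names, and the `EarlyStopper` helper with its `patience` parameter are assumptions, since the quoted excerpt elides which model component each configuration belongs to and gives no early-stopping details beyond "based on validation accuracy".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainingSetup:
    """One quoted training configuration; field names are illustrative."""
    learning_rate: float
    batch_size: Optional[int] = None        # None where the quote gives no value
    max_iterations: Optional[int] = None
    weight_decay: Optional[float] = None
    optimizer: Optional[str] = None
    loss: Optional[str] = None
    early_stopping_metric: Optional[str] = None


# Hypothetical labels: the excerpt does not say which component each
# configuration trains, so they are numbered in quoted order.
QUOTED_SETUPS = {
    "setup_1": TrainingSetup(
        learning_rate=5e-4,
        batch_size=64,
        max_iterations=360_000,
        optimizer="Adam",
        early_stopping_metric="validation accuracy",
    ),
    "setup_2": TrainingSetup(
        learning_rate=0.01,
        weight_decay=5e-4,
        loss="cross-entropy",
    ),
    "setup_3": TrainingSetup(
        learning_rate=1e-5,
        batch_size=64,
        max_iterations=1_000_000,
    ),
}

# Vector sizes quoted in the same row (key names are illustrative).
DIMENSIONS = {
    "encoder_decoder_hidden": 256,
    "word_vector": 300,
    "multimodal_graph_vector": 100,
}


class EarlyStopper:
    """Signal a stop after `patience` checks without a new best accuracy.

    `patience` is an assumption; the quoted text does not specify it.
    """

    def __init__(self, patience: int = 5):
        self.patience = patience
        self.best = float("-inf")
        self.stale_checks = 0

    def should_stop(self, val_accuracy: float) -> bool:
        if val_accuracy > self.best:
            # New best validation accuracy: reset the counter.
            self.best = val_accuracy
            self.stale_checks = 0
        else:
            self.stale_checks += 1
        return self.stale_checks >= self.patience
```

In a training loop, `should_stop` would be called once per validation pass, and training would end either when it returns `True` or when `max_iterations` is reached, whichever comes first.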