reproducibilityindex.ai

Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning"

Authors: Saeed Amizadeh, Hamid Palangi, Alex Polozov, Yichen Huang, Kazuhito Koishida

ICML 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we experimentally demonstrate how we can incorporate our framework for evaluating the visual and the reasoning aspects of the VQA in a decoupled manner. To this end, we have performed experiments using our framework and candidate VQA models on the GQA dataset.
Researcher Affiliation	Industry	(1) Microsoft Applied Sciences Group (ASG), Redmond WA, USA; (2) Microsoft Research AI, Redmond WA, USA. Correspondence to: Saeed Amizadeh <saamizad@microsoft.com>.
Pseudocode	Yes	Algorithm 1: Question answering in DFOL. Input: question FQ (binary or open), threshold θ. If FQ is a binary question, then return α(FQ) > θ; else let {a1, ..., ak} be the plausible answers for FQ and return argmax_{1 ≤ i ≤ k} α(FQ,ai).
Open Source Code	Yes	The PyTorch code for the ∇-FOL framework is publicly available at https://github.com/microsoft/DFOL-VQA.
Open Datasets	Yes	To this end, we use the GQA dataset (Hudson & Manning, 2019b) of multi-step functional visual questions.
Dataset Splits	Yes	The GQA dataset consists of 22M questions defined over 130K real-life images. Each image in the Train/Validation splits is accompanied by the scene graph annotation, and each question in the Train/Validation/Test-Dev splits comes with its equivalent program.
Hardware Specification	No	The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU or CPU models, or cloud computing specifications.
Software Dependencies	No	The paper mentions 'PyTorch code' and 'Adam optimizer' but does not specify their version numbers or versions for other software dependencies needed to replicate the experiments.
Experiment Setup	Yes	Training setup: For training all of the ∇-FOL models, we have used Adam optimizer with learning rate 10^-4 and weight decay 10^-10. The dropout ratio is set to 0.1. We have also applied gradient clipping with norm 0.65.
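
The decision rule quoted in the Pseudocode row can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' implementation: the `answer_question` function, the `toy_alpha` scorer, its score table, and the convention that a missing answer list marks a binary question are all assumptions introduced here. In the paper, α is computed by the framework from the image and the question's formula.

```python
from typing import Callable, Optional, Sequence, Union


def answer_question(
    alpha: Callable[[str, Optional[str]], float],
    question: str,
    theta: float,
    plausible_answers: Optional[Sequence[str]] = None,
) -> Union[bool, str]:
    """Sketch of the quoted Algorithm 1 decision rule.

    `alpha` is a hypothetical stand-in for the likelihood score α.
    Assumption: a question is treated as binary when no answer list is given.
    """
    if plausible_answers is None:
        # Binary question: compare the score against the threshold θ.
        return alpha(question, None) > theta
    if not plausible_answers:
        raise ValueError("an open question needs at least one plausible answer")
    # Open question: return the plausible answer with the highest score.
    return max(plausible_answers, key=lambda a: alpha(question, a))


# Made-up scores used only to exercise the rule; not results from the paper.
_TOY_SCORES = {
    ("is the cup red?", None): 0.8,
    ("what color is the cup?", "red"): 0.7,
    ("what color is the cup?", "blue"): 0.2,
}


def toy_alpha(question: str, answer: Optional[str] = None) -> float:
    """Look up a made-up score; unknown pairs score 0.0."""
    return _TOY_SCORES.get((question, answer), 0.0)


binary_result = answer_question(toy_alpha, "is the cup red?", theta=0.5)
open_result = answer_question(
    toy_alpha,
    "what color is the cup?",
    theta=0.5,
    plausible_answers=["red", "blue"],
)
```

With the toy scores, the binary question clears the threshold and the open question resolves to the highest-scoring candidate. Passing the scorer in as a callable keeps the decision rule separate from how α is actually computed.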