Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning"
Authors: Saeed Amizadeh, Hamid Palangi, Alex Polozov, Yichen Huang, Kazuhito Koishida
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we experimentally demonstrate how we can incorporate our framework for evaluating the visual and the reasoning aspects of the VQA in a decoupled manner. To this end, we have performed experiments using our framework and candidate VQA models on the GQA dataset. |
| Researcher Affiliation | Industry | 1Microsoft Applied Sciences Group (ASG), Redmond WA, USA 2Microsoft Research AI, Redmond WA, USA. Correspondence to: Saeed Amizadeh <saamizad@microsoft.com>. |
| Pseudocode | Yes | Algorithm 1 Question answering in DFOL. Input: question F_Q (binary or open), threshold θ. If F_Q is a binary question then return α(F_Q) > θ; else let {a_1, ..., a_k} be the plausible answers for F_Q and return argmax_{1≤i≤k} α(F_{Q,a_i}). |
| Open Source Code | Yes | The PyTorch code for the ∇-FOL framework is publicly available at https://github.com/microsoft/DFOL-VQA. |
| Open Datasets | Yes | To this end, we use the GQA dataset (Hudson & Manning, 2019b) of multi-step functional visual questions. |
| Dataset Splits | Yes | The GQA dataset consists of 22M questions defined over 130K real-life images. Each image in the Train/Validation splits is accompanied by the scene graph annotation, and each question in the Train/Validation/Test-Dev splits comes with its equivalent program. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU or CPU models, or cloud computing specifications. |
| Software Dependencies | No | The paper mentions 'PyTorch code' and the 'Adam optimizer' but does not specify version numbers for them or for any other software dependencies needed to replicate the experiments. |
| Experiment Setup | Yes | Training setup: For training all ∇-FOL models, we have used the Adam optimizer with learning rate 10⁻⁴ and weight decay 10⁻¹⁰. The dropout ratio is set to 0.1. We have also applied gradient clipping with norm 0.65. |
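The decision rule in Algorithm 1 can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: `alpha` stands in for the DFOL truth-degree evaluator α, and the function and argument names are hypothetical.

```python
def answer_question(alpha, question, answers=None, threshold=0.5):
    """Sketch of Algorithm 1 (question answering in DFOL).

    `alpha` is assumed to map a formula (or a (formula, answer)
    pair for open questions) to a truth degree in [0, 1].
    """
    if answers is None:
        # Binary question: compare the formula's truth degree to θ.
        return alpha(question) > threshold
    # Open question: return the most plausible candidate answer,
    # i.e. argmax over α(F_{Q,a_i}) for the plausible answers a_i.
    return max(answers, key=lambda a: alpha((question, a)))
```

For a binary question the routine returns a yes/no verdict; for an open question it scores each plausible answer and keeps the argmax.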
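The reported training hyperparameters can be collected into a small config, with gradient clipping by global norm shown in pure Python for clarity. This is a sketch of the stated settings only; in PyTorch the same setup would typically use `torch.optim.Adam` and `torch.nn.utils.clip_grad_norm_`, though the paper does not show that code.

```python
import math

# Hyperparameters as reported in the paper's training setup.
TRAIN_CONFIG = {
    "optimizer": "Adam",
    "lr": 1e-4,
    "weight_decay": 1e-10,
    "dropout": 0.1,
    "grad_clip_norm": 0.65,
}

def clip_grad_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their global
    L2 norm is at most max_norm (a simplified stand-in for
    torch.nn.utils.clip_grad_norm_)."""
    total = math.sqrt(sum(g * g for g in grads))
    if total > max_norm:
        scale = max_norm / total
        grads = [g * scale for g in grads]
    return grads
```

Clipping with norm 0.65, as reported, caps the global gradient norm before each optimizer step.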