Learning the Dynamics of Visual Relational Reasoning via Reinforced Path Routing
Authors: Chenchen Jing, Yunde Jia, Yuwei Wu, Chuanhao Li, Qi Wu
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on referring expression comprehension and visual question answering demonstrate the effectiveness of our method. |
| Researcher Affiliation | Academia | (1) Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, China; (2) Australian Centre for Robotic Vision, University of Adelaide, Australia |
| Pseudocode | No | The paper describes the methodology and model architecture but does not include structured pseudocode or algorithm blocks that are clearly labeled or formatted as such. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | We use two REC datasets: the CLEVR-Ref+ (Liu et al. 2019b) that is a synthetic diagnostic dataset, and the Ref-reasoning (Yang, Li, and Yu 2020) that contains real images. [...] The challenging GQA dataset (Hudson and Manning 2019a) that contains compositional questions about real-world images is used. |
| Dataset Splits | Yes | There are a train split and a val split in the CLEVR-Ref+ dataset. [...] The GQA dataset (Hudson and Manning 2019a) ... has a train split for training, a test-dev split for validation, and a test split for online testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions software tools like the Spacy tool and various models/detectors, but it does not provide specific version numbers for these or other ancillary software components required for replication. |
| Experiment Setup | Yes | For the Ref-reasoning, the hyper-parameters µ, λ and γ are set as 0.01, 0.5, and 0.01. For the CLEVR-Ref+, the three hyperparameters are set as 0.01, 0.5, and 0.001. The max number of time steps is set as 4 for the Ref-reasoning and 3 for the CLEVR-Ref+. For both datasets, the dimensions of the spatial feature d_b and the common space d are set as 128 and 512, respectively. |
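
The hyper-parameters quoted in the Experiment Setup row could be collected into a small per-dataset configuration for a replication attempt. The sketch below is a hypothetical reconstruction: the numeric values are those reported in the paper, but the variable names (mu, lam, gamma, max_steps, d_b, d) and the dictionary layout are illustrative assumptions, not the authors' code.

```python
# Hypothetical per-dataset configuration assembled from the values reported
# in the paper; names and structure are illustrative, not the authors' code.
CONFIGS = {
    "ref-reasoning": {
        "mu": 0.01,      # hyper-parameter mu (reported)
        "lam": 0.5,      # hyper-parameter lambda (reported)
        "gamma": 0.01,   # hyper-parameter gamma (reported)
        "max_steps": 4,  # max number of time steps (reported)
        "d_b": 128,      # spatial feature dimension (reported)
        "d": 512,        # common space dimension (reported)
    },
    "clevr-ref+": {
        "mu": 0.01,
        "lam": 0.5,
        "gamma": 0.001,  # differs from Ref-reasoning (reported)
        "max_steps": 3,
        "d_b": 128,
        "d": 512,
    },
}

if __name__ == "__main__":
    # Example lookup for a replication run on Ref-reasoning.
    cfg = CONFIGS["ref-reasoning"]
    print(cfg)
```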