Visual Reasoning by Progressive Module Networks
Authors: Seung Wook Kim, Makarand Tapaswi, Sanja Fidler
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our model in learning a set of visual reasoning tasks, and demonstrate improved performances in all tasks by learning progressively. By evaluating the reasoning process using human judges, we show that our model is more interpretable than an attention-based baseline. |
| Researcher Affiliation | Collaboration | Seung Wook Kim (1,2), Makarand Tapaswi (1,2), Sanja Fidler (1,2,3); (1) Department of Computer Science, University of Toronto; (2) Vector Institute, Canada; (3) NVIDIA. {seung,makarand,fidler}@cs.toronto.edu |
| Pseudocode | Yes | Algorithm 1: Computation performed by our Progressive Module Network, for one module Mn. (See the module sketch below the table.) |
| Open Source Code | No | The paper provides no statement about the availability of its source code and no link to a code repository for the implemented methodology. |
| Open Datasets | Yes | We conduct experiments on three datasets (see Appendix B.1 for details): Visual Genome (VG) (Krishna et al., 2016), VQA 2.0 (Goyal et al., 2017), MS-COCO (Lin et al., 2014). |
| Dataset Splits | Yes | We train and validate the relationship detection module using 200K/38K train/val tuples. (See the split sketch below the table.) |
| Hardware Specification | No | The paper acknowledges NVIDIA for a donation of GPUs but does not specify the models or any other hardware components (CPU, memory, etc.) used for running the experiments. |
| Software Dependencies | No | The paper mentions various components and models (e.g., GRU, Faster R-CNN, GloVe) but does not provide specific version numbers for any software dependencies like programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | Image captioning: Adam optimizer, learning rate 0.0005, batch size 64, 20 epochs. Relationship detection: Adam, learning rate 0.0005, batch size 128, 20 epochs. Counting: Adam, learning rate 0.0001, batch size 128, 20 epochs. VQA: Adam, learning rate 0.0005, batch size 128, 7 epochs. (See the optimizer sketch below the table.) |
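
The Algorithm 1 caption quoted in the Pseudocode row describes the computation one module Mn performs, which, per the paper's title and abstract, involves composing previously learned modules. As a rough, unofficial illustration only, here is a minimal PyTorch sketch of that progressive pattern: a parent module transmits a query to each lower-level module, receives and projects their outputs, soft-weights them, and updates its own state. Every name here (`PMNModule`, `query_transmitter`, `receiver`, `module_scorer`) and every dimension is an assumption made for illustration, not the authors' Algorithm 1.

```python
import torch
import torch.nn as nn

class PMNModule(nn.Module):
    """Illustrative sketch (not the authors' code) of one PMN module M_n.

    M_n solves its task by querying previously learned lower-level
    modules M_0..M_{n-1}, soft-weighting their answers, and updating
    its own state before producing an output.
    """

    def __init__(self, lower_modules, state_dim, query_dim, out_dim):
        super().__init__()
        self.lower_modules = nn.ModuleList(lower_modules)  # pre-trained modules
        self.query_transmitter = nn.Linear(state_dim, query_dim)
        self.receiver = nn.Linear(out_dim, state_dim)
        self.module_scorer = nn.Linear(state_dim, len(lower_modules))
        self.state_update = nn.GRUCell(state_dim, state_dim)
        self.output_head = nn.Linear(state_dim, out_dim)

    def forward(self, state):
        # 1. Transmit a query derived from the current state.
        query = self.query_transmitter(state)                         # (B, Q)
        # 2. Execute every lower module and project its output.
        answers = torch.stack(
            [self.receiver(m(query)) for m in self.lower_modules], 1  # (B, n, S)
        )
        # 3. Softly attend over modules and pool their answers.
        attn = torch.softmax(self.module_scorer(state), dim=-1)       # (B, n)
        pooled = (attn.unsqueeze(-1) * answers).sum(dim=1)            # (B, S)
        # 4. Update the state and emit this module's output.
        state = self.state_update(pooled, state)
        return self.output_head(state), state

# Toy usage: two lower "modules" mapping a 32-d query to a 16-d answer.
lower = [nn.Linear(32, 16) for _ in range(2)]
module_n = PMNModule(lower, state_dim=64, query_dim=32, out_dim=16)
out, new_state = module_n(torch.randn(8, 64))
```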
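
The Dataset Splits row reports 200K/38K train/val relationship tuples. Below is a minimal sketch of producing splits of exactly those sizes with `torch.utils.data.random_split`; the dataset object is a placeholder, and the paper does not state whether the authors' split was random or predefined.

```python
import torch
from torch.utils.data import TensorDataset, random_split

N_TRAIN, N_VAL = 200_000, 38_000  # sizes quoted in the Dataset Splits row

# Placeholder standing in for the Visual Genome relationship tuples.
tuples = TensorDataset(torch.arange(N_TRAIN + N_VAL))

train_set, val_set = random_split(
    tuples,
    [N_TRAIN, N_VAL],
    generator=torch.Generator().manual_seed(0),  # fixed seed for reproducibility
)
assert len(train_set) == N_TRAIN and len(val_set) == N_VAL
```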
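
The Experiment Setup row fixes the Adam configuration per task. The sketch below collects those quoted hyperparameters and builds the corresponding optimizer and data loader; the model, the dataset, and the function name `build_training` are placeholders, and nothing beyond the quoted learning rates, batch sizes, and epoch counts comes from the paper.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hyperparameters exactly as quoted in the Experiment Setup row.
SETUPS = {
    "image_captioning":       dict(lr=5e-4, batch_size=64,  epochs=20),
    "relationship_detection": dict(lr=5e-4, batch_size=128, epochs=20),
    "counting":               dict(lr=1e-4, batch_size=128, epochs=20),
    "vqa":                    dict(lr=5e-4, batch_size=128, epochs=7),
}

def build_training(model: nn.Module, dataset, task: str):
    """Return the Adam optimizer, loader, and epoch count for one task."""
    cfg = SETUPS[task]
    optimizer = torch.optim.Adam(model.parameters(), lr=cfg["lr"])
    loader = DataLoader(dataset, batch_size=cfg["batch_size"], shuffle=True)
    return optimizer, loader, cfg["epochs"]

# Placeholder model and data, just to show the call.
opt, loader, epochs = build_training(
    nn.Linear(10, 2), TensorDataset(torch.randn(256, 10)), "vqa"
)
```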