Visual Reasoning by Progressive Module Networks

Authors: Seung Wook Kim, Makarand Tapaswi, Sanja Fidler

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test our model in learning a set of visual reasoning tasks, and demonstrate improved performances in all tasks by learning progressively. By evaluating the reasoning process using human judges, we show that our model is more interpretable than an attention-based baseline.
Researcher Affiliation | Collaboration | Seung Wook Kim (1,2), Makarand Tapaswi (1,2), Sanja Fidler (1,2,3); (1) Department of Computer Science, University of Toronto; (2) Vector Institute, Canada; (3) NVIDIA; {seung,makarand,fidler}@cs.toronto.edu
Pseudocode | Yes | Algorithm 1: Computation performed by our Progressive Module Network, for one module M_n. (An illustrative sketch follows the table.)
Open Source Code | No | The paper does not provide any statements about the availability of its source code or links to a code repository for the implemented methodology.
Open Datasets | Yes | We conduct experiments on three datasets (see Appendix B.1 for details): Visual Genome (VG) (Krishna et al., 2016), VQA 2.0 (Goyal et al., 2017), MS-COCO (Lin et al., 2014).
Dataset Splits | Yes | We train and validate the relationship detection module using 200K/38K train/val tuples.
Hardware Specification | No | The paper acknowledges NVIDIA for a donation of GPUs but does not specify the models or any other hardware components (CPU, memory, etc.) used for running the experiments.
Software Dependencies | No | The paper mentions various components and models (e.g., GRU, Faster R-CNN, GloVe) but does not provide specific version numbers for any software dependencies like programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | Adam optimizer throughout: Image Captioning at learning rate 0.0005 with batch size 64 for 20 epochs; Relationship Detection at learning rate 0.0005 with batch size 128 for 20 epochs; Counting at learning rate 0.0001 with batch size 128 for 20 epochs; VQA at learning rate 0.0005 with batch size 128 for 7 epochs. (A configuration sketch follows the table.)
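The Pseudocode row points to Algorithm 1, the per-module computation of a Progressive Module Network. The paper's algorithm is not reproduced on this page, so the following is a minimal PyTorch-style sketch of the general idea only, under the assumption that a module M_n queries previously learned lower-level modules, softly weights their answers, and combines them with its own computation. All names here (ProgressiveModule, query_makers, receivers, importance) are illustrative and are not taken from the authors' code.

```python
import torch
import torch.nn as nn

class ProgressiveModule(nn.Module):
    """Illustrative sketch of one module M_n that can call lower-level modules."""

    def __init__(self, dim, lower_modules):
        super().__init__()
        # Previously learned modules M_0 .. M_{n-1}, each assumed to map (B, dim) -> (B, dim).
        self.lower_modules = nn.ModuleList(lower_modules)
        self.query_makers = nn.ModuleList([nn.Linear(dim, dim) for _ in lower_modules])
        self.receivers = nn.ModuleList([nn.Linear(dim, dim) for _ in lower_modules])
        self.importance = nn.Linear(dim, len(lower_modules))
        self.head = nn.Linear(2 * dim, dim)

    def forward(self, state):
        # Soft importance over which lower modules to rely on for this input.
        weights = torch.softmax(self.importance(state), dim=-1)        # (B, n)

        # Query each lower module and project its answer back into M_n's space.
        answers = []
        for k, module in enumerate(self.lower_modules):
            query = self.query_makers[k](state)
            answers.append(self.receivers[k](module(query)))
        answers = torch.stack(answers, dim=1)                          # (B, n, dim)

        # Weighted combination of lower-module outputs.
        combined = (weights.unsqueeze(-1) * answers).sum(dim=1)        # (B, dim)

        # M_n's own output uses its state plus the gathered evidence.
        return self.head(torch.cat([state, combined], dim=-1))
```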
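The hyperparameters quoted in the Experiment Setup row can be collected into a small per-task configuration. The sketch below assumes a PyTorch-style setup (the paper does not name its framework); TRAIN_CONFIG and make_optimizer are hypothetical helpers, while the learning rates, batch sizes, and epoch counts are the values quoted above.

```python
import torch

# Per-task training hyperparameters as quoted in the Experiment Setup row.
TRAIN_CONFIG = {
    "image_captioning":       {"lr": 5e-4, "batch_size": 64,  "epochs": 20},
    "relationship_detection": {"lr": 5e-4, "batch_size": 128, "epochs": 20},
    "counting":               {"lr": 1e-4, "batch_size": 128, "epochs": 20},
    "vqa":                    {"lr": 5e-4, "batch_size": 128, "epochs": 7},
}

def make_optimizer(model, task):
    """Build the Adam optimizer described for the given task."""
    cfg = TRAIN_CONFIG[task]
    return torch.optim.Adam(model.parameters(), lr=cfg["lr"])
```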