Learning to Count Objects in Natural Images for Visual Question Answering
Authors: Yan Zhang, Jonathon Hare, Adam Prügel-Bennett
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on a toy task show the effectiveness of this component and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. |
| Researcher Affiliation | Academia | Yan Zhang & Jonathon Hare & Adam Prügel-Bennett, Department of Electronics and Computer Science, University of Southampton {yz5n12,jsh2,apb}@ecs.soton.ac.uk |
| Pseudocode | No | The paper describes the algorithmic steps and equations within the text, but does not present a formal pseudocode block or a clearly labeled algorithm figure. |
| Open Source Code | Yes | Our implementation is available at https://github.com/Cyanogenoid/vqa-counting. |
| Open Datasets | Yes | On the number category of the VQA v2 Open-Ended dataset (Goyal et al., 2017), a relatively simple baseline model using the counting component outperforms all previous models... |
| Dataset Splits | Yes | The model is trained for 100 epochs (1697 iterations per epoch to train on the training set, 2517 iterations per epoch to train on both training and validation sets) instead of 100,000 iterations, roughly in line with the doubling of dataset size when going from VQA v1 to VQA v2. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper mentions several techniques and components (e.g., Adam, LSTM, GRU, Batch Normalization) and cites their original papers, but does not specify software versions for programming languages, libraries, or frameworks used (e.g., Python version, PyTorch/TensorFlow version). |
| Experiment Setup | Yes | They are trained with crossentropy loss for 1000 iterations using Adam (Kingma & Ba, 2015) with a learning rate of 0.01 and a batch size of 1024. The learning rate is increased from 0.001 to 0.0015 and the batch size is doubled to 256. The model is trained for 100 epochs (1697 iterations per epoch to train on the training set, 2517 iterations per epoch to train on both training and validation sets) instead of 100,000 iterations... |
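The toy-task optimizer settings quoted above (Adam, learning rate 0.01, 1000 iterations) can be sketched as a minimal pure-Python Adam update. This is an illustrative sketch only, not the authors' implementation (which is at the linked GitHub repository); the beta1/beta2/epsilon values are Adam's defaults from Kingma & Ba (2015) and are assumptions here, since the paper's quoted setup does not state them.

```python
import math

# Learning rate quoted from the paper's toy-task setup; the moment-decay
# rates and epsilon below are Adam's standard defaults (an assumption).
LR = 0.01
BETA1, BETA2, EPS = 0.9, 0.999, 1e-8

def adam_step(theta, grad, m, v, t, lr=LR):
    """One Adam update on a scalar parameter (Kingma & Ba, 2015)."""
    m = BETA1 * m + (1 - BETA1) * grad       # first-moment (mean) estimate
    v = BETA2 * v + (1 - BETA2) * grad ** 2  # second-moment estimate
    m_hat = m / (1 - BETA1 ** t)             # bias correction
    v_hat = v / (1 - BETA2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + EPS)
    return theta, m, v

# Toy check on a hypothetical objective f(theta) = (theta - 3)^2,
# run for 1000 iterations as in the quoted toy-task setup.
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 1001):
    grad = 2 * (theta - 3)
    theta, m, v = adam_step(theta, grad, m, v, t)
```

With these settings the parameter converges close to the minimum at 3, illustrating the optimizer configuration the quote describes; the paper's actual models are of course trained with this optimizer on cross-entropy loss over network weights, not a scalar toy objective.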