Automatic Generation of Grounded Visual Questions
Authors: Shijie Zhang, Lizhen Qu, Shaodi You, Zhenglu Yang, Jiawan Zhang
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to evaluate our model as well as the most competitive baseline with three kinds of measures adapted from those commonly used in image caption generation and machine translation. The experimental results on two real-world datasets show that our model outperforms the strongest baseline in terms of both correctness and diversity by a wide margin. |
| Researcher Affiliation | Academia | Shijie Zhang (School of Computer Science and Technology, Tianjin University, Tianjin, China); Lizhen Qu (Data61-CSIRO, Canberra, Australia); Shaodi You (Australian National University, Canberra, Australia); Zhenglu Yang (College of Computer and Control Engineering, Nankai University, Tianjin, China); Jiawan Zhang (School of Computer Software, Tianjin University, Tianjin, China) |
| Pseudocode | No | The paper describes its methods and processes using textual descriptions and mathematical equations, but it does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We conduct our experiments on two datasets: VQA-Dataset [Antol et al., 2015] and Visual7W [Zhu et al., 2015]. The images in those datasets are sampled from the MS-COCO dataset [Lin et al., 2014]. |
| Dataset Splits | No | The paper mentions tuning hyperparameters 'on the validation sets' but does not provide specific details on the dataset splits (e.g., percentages or counts for training, validation, and test sets). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU specifications, or memory. |
| Software Dependencies | No | The paper mentions various models and algorithms used (e.g., GloVe, VGG-16, DenseCap, LSTM, Adam) and cites their original papers, but it does not provide specific version numbers for any software, libraries, or frameworks used in its implementation. |
| Experiment Setup | Yes | We fix the batch size to 64 and set the maximal number of epochs to 64 for Visual7W and 128 for VQA. The corresponding model hyperparameters were tuned on the validation sets. Herein, we set α = 0.75. |
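The setup row above pins down only the batch size, the per-dataset epoch budgets, and α. To make those values concrete, here is a minimal configuration sketch in Python; everything not quoted in the table (the `TrainConfig` class itself, the `lr` field and its value, the dictionary layout) is a hypothetical placeholder, not the authors' implementation.

```python
# Minimal training-config sketch encoding the hyperparameters the paper reports.
# Only batch_size, max_epochs, and alpha = 0.75 come from the quoted setup;
# the learning rate and all field names are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class TrainConfig:
    dataset: str           # "Visual7W" or "VQA"
    max_epochs: int        # 64 for Visual7W, 128 for VQA (from the paper)
    batch_size: int = 64   # fixed to 64 in the paper
    alpha: float = 0.75    # weighting hyperparameter reported as alpha = 0.75
    lr: float = 1e-3       # assumption: not reported in the paper


CONFIGS = {
    "Visual7W": TrainConfig(dataset="Visual7W", max_epochs=64),
    "VQA": TrainConfig(dataset="VQA", max_epochs=128),
}

if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        print(name, cfg)
```

Note that nothing in this sketch recovers the details the paper omits (learning rate, hidden sizes, initialization, or the exact validation splits used for tuning); it only makes explicit what the quoted setup states.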