Large-Scale Answerer in Questioner's Mind for Visual Dialog Question Generation

Authors: Sang-Woo Lee, Tong Gao, Sohee Yang, Jaejun Yoo, Jung-Woo Ha

ICLR 2019

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We evaluate our method on GuessWhich, a challenging task-oriented visual dialog problem, where the number of candidate classes is near 10K. Our experimental results and ablation studies show that AQM+ outperforms the state-of-the-art models by a remarkable margin with a reasonable approximation." |
| Researcher Affiliation | Industry | Sang-Woo Lee, Tong Gao, Sohee Yang, Jaejun Yoo, & Jung-Woo Ha, Clova AI Research, NAVER Corp. {sang.woo.lee,tong.gao,sh.yang,jaejun.yoo,jungwoo.ha}@navercorp.com |
| Pseudocode | Yes | "Algorithm 1: Question Generating Process of AQM+ in Our GuessWhich Experiments" (a hedged sketch of this selection step follows the table) |
| Open Source Code | Yes | "Our code is modified from the code of Modhe et al. (2018), and we make our code publicly available." https://github.com/naver/aqm-plus |
| Open Datasets | Yes | GuessWhich uses the Visual Dialog dataset (Das et al., 2017a), which includes human dialogs on MSCOCO images (Lin et al., 2014) as well as the generated captions. |
| Dataset Splits | No | The paper mentions "training data" and "test images" but does not explicitly provide details about training/validation/test splits, such as percentages or sample counts for a distinct validation set. |
| Hardware Specification | Yes | "We used Tesla P40 for our experiments." |
| Software Dependencies | No | The paper mentions the NAVER Smart Machine Learning (NSML) platform and notes that "our code is modified from the code of Modhe et al. (2018)", which implies PyTorch; however, it does not specify exact version numbers for any software dependency such as Python or PyTorch. |
| Experiment Setup | Yes | "We set \|C_{t,topk}\| = \|Q_{t,gen}\| = \|A_{t,topk}(q_t)\| = 20. The epoch for SL-Q is 60. The epoch for RL-Q and RL-QA is 20 for non-delta, and 15 for delta, respectively." |
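
For context on the Pseudocode row: Algorithm 1 of the paper selects each question by maximizing an approximate information gain computed over top-K candidate classes, generated questions, and candidate answers. Below is a minimal, hypothetical NumPy sketch of that selection step; the function and argument names (`select_question`, `p_a_given_c_q`, `prior_c`) are our own, and the probability tables are assumed to be precomputed by the questioner's learned answerer and guesser models, not taken from the authors' code.

```python
import numpy as np

def select_question(p_a_given_c_q, prior_c, eps=1e-12):
    """Pick the question with maximal approximate information gain.

    Sketch of the selection step in AQM+'s Algorithm 1; names and tensor
    shapes here are our assumptions, not the authors' implementation.

    p_a_given_c_q: (num_questions, num_classes, num_answers) array of
        approximate answer likelihoods p~(a | c, q, h_{t-1}), restricted
        to the top-K classes C_{t,topk} and top-K answers A_{t,topk}(q).
    prior_c: (num_classes,) array, the approximate posterior
        p^(c | h_{t-1}) over the top-K classes, renormalized to sum to 1.
    """
    gains = np.empty(p_a_given_c_q.shape[0])
    for j, p_acq in enumerate(p_a_given_c_q):  # p_acq: (classes, answers)
        # Marginal answer distribution p~'(a | q, h) = sum_c p~(a|c,q,h) p^(c|h)
        p_a = prior_c @ p_acq
        # Gain = sum_{c,a} p^(c|h) p~(a|c,q,h) * ln[ p~(a|c,q,h) / p~'(a|q,h) ]
        gains[j] = np.sum(prior_c[:, None] * p_acq
                          * np.log((p_acq + eps) / (p_a[None, :] + eps)))
    return int(np.argmax(gains)), gains

# Toy usage with the paper's setting |C_{t,topk}| = |Q_{t,gen}| = |A_{t,topk}(q_t)| = 20.
K = 20
rng = np.random.default_rng(0)
likelihoods = rng.dirichlet(np.ones(K), size=(K, K))  # (questions, classes, answers)
prior = rng.dirichlet(np.ones(K))
best_q, gains = select_question(likelihoods, prior)
```

Restricting all three sets to the top-K = 20 elements is what keeps this tractable when the full candidate-class space is near 10K.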
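
The remaining training settings quoted in the Experiment Setup row can be summarized as a small config; this is a hypothetical layout, and the key names are ours.

```python
# Hypothetical summary of the quoted training schedule; key names are ours.
training_epochs = {
    "SL-Q": 60,
    "RL-Q":  {"non-delta": 20, "delta": 15},
    "RL-QA": {"non-delta": 20, "delta": 15},
}
```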