Improving Neural Question Generation Using Answer Separation

Authors: Yanghoon Kim, Hwanhee Lee, Joongbo Shin, Kyomin Jung (pp. 6602-6609)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that our answer separation method significantly reduces the number of improper questions which include answers. Consequently, our model significantly outperforms previous state-of-the-art NQG models.
Researcher Affiliation | Academia | Yanghoon Kim (1,2), Hwanhee Lee (1), Joongbo Shin (1), Kyomin Jung (1,2); (1) Seoul National University, Seoul, Korea; (2) Automation and Systems Research Institute, Seoul National University, Seoul, Korea
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper does not provide any concrete access information (e.g., a specific repository link or an explicit code release statement) for the source code of the described methodology.
Open Datasets | Yes | For fair comparison, we use the same dataset that is used by previous works (Du, Shao, and Cardie 2017; Zhou et al. 2017; Song et al. 2018): two processed versions of the SQuAD (Rajpurkar et al. 2016) dataset.
Dataset Splits | Yes | As a result, data split-1 and data split-2 contain 70,484/10,570/11,877 and 86,635/8,965/8,964 triplets, respectively (see the sketch below the table).
Hardware Specification | Yes | We implement our models in Tensorflow 1.4 and train the model with a single GTX 1080 Ti.
Software Dependencies | Yes | We implement our models in Tensorflow 1.4.
Experiment Setup | Yes | The number of hidden units in both the encoders and the decoder is 350. For both the encoder and decoder, we use the 34k most frequent words appearing in the training corpus, replacing the rest with the <UNK> token. Weight normalization is applied to the attention module, and dropout with P_drop = 0.4 is applied to both the RNNs and the attention module. The layer size of the keyword-net is set to 4. During training, we optimize the cross-entropy loss with gradient descent using the Adam (Kingma and Ba 2014) optimizer and an initial learning rate of 0.001. The mini-batch size for each update is 128, and the model is trained for up to 17 epochs. At test time, we conduct beam search with beam width 10 and length penalty weight 2.1 (collected in the second sketch below).
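
The Dataset Splits row reports only raw counts; below is a minimal Python sketch that organizes them, assuming the three numbers are train/dev/test sizes in that order (the dict layout and names are ours, not the authors' code).

# Triplet counts for the two processed SQuAD splits, as reported in the paper.
# Assumption: the three numbers correspond to train/dev/test, in that order.
SPLIT_SIZES = {
    "data-split-1": {"train": 70_484, "dev": 10_570, "test": 11_877},
    "data-split-2": {"train": 86_635, "dev": 8_965, "test": 8_964},
}

if __name__ == "__main__":
    for split, sizes in SPLIT_SIZES.items():
        print(f"{split}: {sizes} (total: {sum(sizes.values()):,})")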
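
Likewise, the Experiment Setup row can be summarized as a flat hyperparameter configuration. This is an illustrative sketch of the reported values only; the key names are ours rather than from the authors' (unreleased) code, and it omits structural choices such as the weight-normalized attention module.

# Hyperparameters reported in the paper (trained with TensorFlow 1.4 on a single GTX 1080 Ti).
# Key names are illustrative; values are taken from the Experiment Setup row above.
HPARAMS = {
    "hidden_units": 350,            # both encoders and the decoder
    "vocab_size": 34_000,           # most frequent training-corpus words; rest -> <UNK>
    "dropout_p": 0.4,               # applied to the RNNs and the attention module
    "keyword_net_layers": 4,        # layer size of the keyword-net
    "optimizer": "adam",            # Adam (Kingma and Ba 2014)
    "learning_rate": 1e-3,
    "batch_size": 128,
    "max_epochs": 17,
    "beam_width": 10,               # beam search at test time
    "length_penalty_weight": 2.1,
}

if __name__ == "__main__":
    for name, value in sorted(HPARAMS.items()):
        print(f"{name} = {value}")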