Teaching Machines to Ask Questions

Authors: Kaichun Yao, Libo Zhang, Tiejian Luo, Lili Tao, Yanjun Wu

IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our model is trained and evaluated on the question-answering dataset SQuAD, and the experimental results show that the proposed model is able to generate diverse and readable questions with the specific attribute.
Researcher Affiliation | Academia | 1. University of the Chinese Academy of Sciences; 2. Institute of Software, Chinese Academy of Sciences; 3. University of the West of England
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not explicitly state that open-source code for the described methodology is provided, nor does it provide a link to such code.
Open Datasets | Yes | We conduct our experiments on the SQuAD dataset [Rajpurkar et al., 2016], which is used for machine reading comprehension and consists of more than 100,000 questions posed by crowd workers on 536 high-PageRank Wikipedia articles.
Dataset Splits | Yes | After pre-processing, the extracted training, development and test sets contain 83,889, 5,168 and 5,000 triples, respectively.
Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory) used for running experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions 'Stanford CoreNLP' and 'glove.840B.300d pre-trained embeddings' but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | We set the dimension of word embedding to 300 and use the glove.840B.300d pre-trained embeddings [Pennington et al., 2014] for initialization. The LSTM hidden unit size is set to 300 and the number of layers of LSTMs is set to 1 in both the encoder and the decoder. We update the model parameters using stochastic gradient descent with mini-batch size of 64. The learning rate of generator G and discriminator D is set to 0.001 and 0.0002, respectively. We clip the gradient when its norm exceeds 5. The scaling factors α and β are set to 0.6 and 0.5. The latent z space size is set to 200. During decoding, we do beam search with a beam size of 3.
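
The experiment-setup row fixes every numeric hyper-parameter the paper reports. Below is a minimal PyTorch sketch of that configuration; the class name Seq2SeqGenerator, the placeholder discriminator, the vocabulary size, and the way the latent z vector is concatenated to the decoder input are illustrative assumptions, not the authors' released implementation. Only the numeric values come from the paper.

```python
# Hedged sketch of the reported training configuration (PyTorch framing assumed).
import torch
import torch.nn as nn

EMB_DIM     = 300      # word embedding size (initialised from glove.840B.300d)
HIDDEN_DIM  = 300      # LSTM hidden unit size
NUM_LAYERS  = 1        # LSTM layers in both encoder and decoder
BATCH_SIZE  = 64       # mini-batch size for SGD
LR_G        = 0.001    # generator learning rate
LR_D        = 0.0002   # discriminator learning rate
GRAD_CLIP   = 5.0      # clip gradients when the norm exceeds 5
ALPHA, BETA = 0.6, 0.5 # scaling factors reported in the paper
LATENT_DIM  = 200      # size of the latent z space
BEAM_SIZE   = 3        # beam width used at decoding time

class Seq2SeqGenerator(nn.Module):
    """Single-layer LSTM encoder-decoder; attention/copy details of the
    actual model are omitted in this sketch."""
    def __init__(self, vocab_size: int):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, EMB_DIM)
        self.encoder = nn.LSTM(EMB_DIM, HIDDEN_DIM, NUM_LAYERS, batch_first=True)
        # Assumption: the latent z vector is concatenated to each decoder input.
        self.decoder = nn.LSTM(EMB_DIM + LATENT_DIM, HIDDEN_DIM, NUM_LAYERS, batch_first=True)
        self.out = nn.Linear(HIDDEN_DIM, vocab_size)

generator = Seq2SeqGenerator(vocab_size=50_000)           # vocab size is an assumption
discriminator = nn.Sequential(nn.Linear(HIDDEN_DIM, 1))   # placeholder discriminator

# Plain SGD with the two reported learning rates.
opt_g = torch.optim.SGD(generator.parameters(), lr=LR_G)
opt_d = torch.optim.SGD(discriminator.parameters(), lr=LR_D)

# Inside the training loop, gradients would be clipped before each update:
#   loss.backward()
#   nn.utils.clip_grad_norm_(generator.parameters(), GRAD_CLIP)
#   opt_g.step()
```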