Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Teaching Machines to Ask Questions
Authors: Kaichun Yao, Libo Zhang, Tiejian Luo, Lili Tao, Yanjun Wu
IJCAI 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our model is trained and evaluated on a question-answering dataset SQuAD, and the experimental results show the proposed model is able to generate diverse and readable questions with the specific attribute. |
| Researcher Affiliation | Academia | 1 University of the Chinese Academy of Sciences 2 Institute of Software Chinese Academy of Sciences 3 University of the West of England |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not explicitly state that open-source code for the described methodology is provided, nor does it provide a link to such code. |
| Open Datasets | Yes | We conduct our experiments on the SQuAD dataset [Rajpurkar et al., 2016], which is used for machine reading comprehension and consists of more than 100,000 questions posed by crowd workers on 536 high-PageRank Wikipedia articles. |
| Dataset Splits | Yes | After pre-processing, the extracted training, development and test sets contain 83,889, 5,168 and 5,000 triples respectively. |
| Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory) used for running experiments are mentioned in the paper. |
| Software Dependencies | No | The paper mentions 'Stanford CoreNLP' and 'glove.840B.300d pre-trained embeddings' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | We set the dimension of word embedding to 300 and use the glove.840B.300d pre-trained embeddings [Pennington et al., 2014] for initialization. The LSTM hidden unit size is set to 300 and the number of layers of LSTMs is set to 1 in both the encoder and the decoder. We update the model parameters using stochastic gradient descent with mini-batch size of 64. The learning rate of generator G and discriminator D is set to 0.001, 0.0002, respectively. We clip the gradient when its norm exceeds 5. The scaling factors α and β are set to 0.6 and 0.5. The latent z space size is set to 200. During decoding, we do beam search with a beam size of 3. |
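
For anyone attempting a reimplementation, the reported experiment setup can be collected into a single configuration object. The sketch below is only an illustration: the class name `ExperimentSetup`, its field names, and the helper `clip_by_norm` are hypothetical and do not come from the paper, which releases no code. The excerpt says only that the gradient is clipped when its norm exceeds 5; rescaling the gradient to that norm is an assumption about how the clipping is done.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters as reported in the paper; field names are hypothetical."""
    embedding_dim: int = 300
    pretrained_embeddings: str = "glove.840B.300d"
    lstm_hidden_size: int = 300
    lstm_layers: int = 1            # same for encoder and decoder
    batch_size: int = 64            # mini-batch SGD
    lr_generator: float = 0.001
    lr_discriminator: float = 0.0002
    grad_clip_norm: float = 5.0
    alpha: float = 0.6              # scaling factor alpha
    beta: float = 0.5               # scaling factor beta
    latent_size: int = 200          # size of the latent z space
    beam_size: int = 3


def clip_by_norm(grad, max_norm):
    """Rescale grad to L2 norm max_norm if its norm exceeds max_norm.

    Assumption: clipping is done by rescaling; the paper does not specify.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]


cfg = ExperimentSetup()
clipped = clip_by_norm([6.0, 8.0], cfg.grad_clip_norm)  # input norm is 10
```

Details the excerpt does not report, such as the number of training epochs, dropout, or random seeds, are deliberately left out rather than guessed.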