EQG-RACE: Examination-Type Question Generation
Authors: Xin Jia, Wenjie Zhou, Xu Sun, Yunfang Wu
AAAI 2021, pp. 13143–13151
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show a state-of-the-art performance of EQG-RACE, which is apparently superior to the baselines. |
| Researcher Affiliation | Academia | MOE Key Lab of Computational Linguistics, School of EECS, Peking University {jemmryx, wjzhou013, xusun, wuyf}@pku.edu.cn |
| Pseudocode | No | The paper describes the model components and their interaction but does not present any pseudocode blocks or algorithms labeled as such. |
| Open Source Code | Yes | We will make our data and code publicly available for further research. Data and code available at: https://github.com/jemmryx/EQGRACE |
| Open Datasets | Yes | We clean the RACE dataset and maintain Specific-style questions to construct an examination-type QG dataset. Data and code available at: https://github.com/jemmryx/EQGRACE |
| Dataset Splits | Yes | The original RACE contains 87,866, 4,887 and 4,934 samples for training, development and testing, respectively. After filtering, the EQG-RACE dataset contains 18,501, 1,035 and 950 <passage, answer, question> triples for training, development and testing, respectively. (These counts are restated in the sketch after the table.) |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types) used for running the experiments. |
| Software Dependencies | No | The paper mentions software like NLTK, Stanford CoreNLP, the Adam optimizer, GloVe, BERT, and ELMo, but does not provide specific version numbers for any of these software dependencies. |
| Experiment Setup | Yes | In our model, the LSTM hidden sizes of encoder and decoder, the word embedding size and the GCN hidden size are all 300. We set the vocabulary to the most frequent 45,000 words. The maximum lengths of input passage and output question are 400 and 30, respectively. We use pre-trained GloVe embedding as initialization of word embedding and fine-tune it during training. We employ Adam as optimizer with a learning rate 0.001 during training. The dropout rate of both encoder and GCN is set to 0.3. In decoding, the beam search size is 10. |
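For convenience, the split statistics quoted in the Dataset Splits row are restated below as a small Python summary. The file names are hypothetical placeholders, since this report does not describe the repository layout; only the numeric counts come from the paper.

```python
# Split sizes reported for EQG-RACE after filtering RACE to Specific-style
# questions. File names are hypothetical placeholders, not taken from the
# authors' repository.
EQG_RACE_SPLITS = {
    "train": {"file": "train.json", "triples": 18_501},
    "dev":   {"file": "dev.json",   "triples": 1_035},
    "test":  {"file": "test.json",  "triples": 950},
}

# Original RACE split sizes before filtering, for comparison.
RACE_SPLITS = {"train": 87_866, "dev": 4_887, "test": 4_934}

def retained_fraction(split: str) -> float:
    """Fraction of the original RACE split kept in EQG-RACE."""
    return EQG_RACE_SPLITS[split]["triples"] / RACE_SPLITS[split]

if __name__ == "__main__":
    for name in EQG_RACE_SPLITS:
        print(f"{name}: {retained_fraction(name):.1%} of RACE retained")
```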
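The Experiment Setup row lists concrete hyperparameter values. Below is a minimal sketch of that configuration, assuming a PyTorch-style seq2seq pipeline; the class and field names are illustrative assumptions, and only the numeric values and stated choices (GloVe initialization, Adam, beam search) come from the paper.

```python
from dataclasses import dataclass

@dataclass
class EQGRaceTrainingConfig:
    """Hyperparameters quoted in the Experiment Setup row above.

    The structure and field names are illustrative assumptions; only the
    numeric values and stated choices come from the paper.
    """
    # Encoder/decoder LSTM hidden size, word embedding size, and GCN hidden size
    # are all reported as 300.
    hidden_size: int = 300
    embedding_size: int = 300
    gcn_hidden_size: int = 300

    # Vocabulary is restricted to the 45,000 most frequent words.
    vocab_size: int = 45_000

    # Maximum input passage and output question lengths (in tokens).
    max_passage_len: int = 400
    max_question_len: int = 30

    # Optimization: Adam with learning rate 0.001.
    optimizer: str = "adam"
    learning_rate: float = 1e-3

    # Dropout applied to both the encoder and the GCN.
    dropout: float = 0.3

    # Beam search width used at decoding time.
    beam_size: int = 10

    # Word embeddings are initialized from pre-trained GloVe and fine-tuned.
    pretrained_embeddings: str = "glove"
    finetune_embeddings: bool = True
```

Keeping the reported settings in a single configuration object like this makes it easy to pass them to model and trainer constructors when attempting a reproduction.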