Generating Distractors for Reading Comprehension Questions from Real Examinations

Authors: Yifan Gao, Lidong Bing, Piji Li, Irwin King, Michael R. Lyu

AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our framework on a distractor generation dataset prepared with the RACE (Lai et al. 2017) dataset. ... The results show that our proposed model beats several baselines and ablations. Human evaluations show that distractors generated by our model are more likely to confuse the examinees, which demonstrates the functionality of our generated distractors in real examinations.
Researcher Affiliation | Collaboration | 1. Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong; 2. R&D Center Singapore, Machine Intelligence Technology, Alibaba DAMO Academy; 3. Tencent AI Lab
Pseudocode | No | The paper describes its model architecture in text and with a diagram, but it does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | Our code and data are available at https://github.com/Evan-Gao/Distractor-Generation-RACE
Open Datasets | Yes | We evaluate our framework on a distractor generation dataset prepared with the RACE (Lai et al. 2017) dataset. ... Our code and data are available at https://github.com/Evan-Gao/Distractor-Generation-RACE
Dataset Splits | Yes | We randomly divide the dataset into the training (80%), validation (10%) and testing sets (10%). (A split sketch is given after this table.)
Hardware Specification | No | The paper does not provide any specific hardware details such as CPU, GPU models, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions using GloVe word embeddings but does not specify version numbers for any software, libraries, or frameworks used in the implementation (e.g., Python, TensorFlow, PyTorch versions).
Experiment Setup | Yes | The bidirectional LSTMs hidden unit size is set to 500 (250 for each direction). ... The hyperparameters λq and λa in static attention are initialized as 1.0 and 1.5 respectively. We use dropout with probability p = 0.3. ... We use stochastic gradient descent (SGD) as the optimizer with a minibatch size of 32 and the initial learning rate 1.0 for all baselines and our model. We train the model for 100k steps and start halving the learning rate at step 50k, then we halve the learning rate every 10k steps till ending. We set the gradient norm upper bound to 5 during the training. ... we set the maximum length for output sequence as 15 and block unigram repeated token, the beam size k is set to 50. (A training-schedule sketch is given after this table.)
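
The 80%/10%/10% division quoted in the Dataset Splits row can be reproduced with a simple random shuffle. The following is an illustrative sketch only; the function name and fixed seed are assumptions and not the authors' released code:

```python
import random

def split_dataset(examples, seed=0):
    """Randomly split examples into 80% train / 10% validation / 10% test.

    Hypothetical helper mirroring the split ratios reported in the paper.
    """
    examples = list(examples)
    random.Random(seed).shuffle(examples)
    n = len(examples)
    n_train = int(0.8 * n)
    n_valid = int(0.1 * n)
    train = examples[:n_train]
    valid = examples[n_train:n_train + n_valid]
    test = examples[n_train + n_valid:]
    return train, valid, test
```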
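
As a hedged illustration of the optimization settings quoted in the Experiment Setup row (SGD with initial learning rate 1.0, minibatch size 32, gradient norm clipped to 5, learning rate halved at step 50k and every 10k steps thereafter, 100k steps in total), here is a minimal PyTorch-style sketch. The model, batch iterator, and loss are placeholders, not the authors' implementation:

```python
import torch
import torch.nn as nn

def lr_at_step(step, initial_lr=1.0, first_decay=50_000, decay_every=10_000):
    """Learning rate under the halving schedule reported in the paper."""
    if step < first_decay:
        return initial_lr
    # One halving at 50k, then one more every 10k steps.
    n_halvings = 1 + (step - first_decay) // decay_every
    return initial_lr * (0.5 ** n_halvings)

def train(model, batches, total_steps=100_000, max_grad_norm=5.0):
    optimizer = torch.optim.SGD(model.parameters(), lr=1.0)
    criterion = nn.NLLLoss()  # placeholder loss over log-probabilities
    for step, (inputs, targets) in enumerate(batches):
        if step >= total_steps:
            break
        # Apply the halving schedule to the current learning rate.
        for group in optimizer.param_groups:
            group["lr"] = lr_at_step(step)
        optimizer.zero_grad()
        log_probs = model(inputs)          # (batch, vocab) log-probabilities
        loss = criterion(log_probs, targets)
        loss.backward()
        # Gradient norm upper bound of 5, as reported in the paper.
        nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
        optimizer.step()
```

The quoted beam size of 50, maximum output length of 15 tokens, and unigram-repeat blocking apply to decoding rather than training, so they are not shown in this training sketch.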