Reinforced Mnemonic Reader for Machine Reading Comprehension

Authors: Minghao Hu, Yuxing Peng, Zhen Huang, Xipeng Qiu, Furu Wei, Ming Zhou

IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. |
| Researcher Affiliation | Collaboration | College of Computer, National University of Defense Technology, Changsha, China; School of Computer Science, Fudan University, Shanghai, China; Microsoft Research, Beijing, China |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We mainly focus on the SQuAD dataset [Rajpurkar et al., 2016] to train and evaluate our model. |
| Dataset Splits | Yes | We mainly focus on the SQuAD dataset [Rajpurkar et al., 2016] to train and evaluate our model. ...until the F1 score on the development set no longer improves. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions using GloVe and ELMo embeddings but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We use the Adam optimizer [Kingma and Ba, 2014] for both ML and DCRL training. The initial learning rates are 0.0008 and 0.0001 respectively, and are halved whenever meeting a bad iteration. The batch size is 48 and a dropout rate [Srivastava et al., 2014] of 0.3 is used to prevent overfitting. Word embeddings remain fixed during training. For out-of-vocabulary words, we set the embeddings from Gaussian distributions and keep them trainable. The size of character embedding and corresponding LSTMs is 50, the main hidden size is 100, and the hyperparameter γ is 3. |
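The reported experiment setup can be sketched as a configuration plus the "halve the learning rate on a bad iteration" rule. This is a minimal illustration, not the authors' code: the `CONFIG` keys and the `step_lr` helper are assumed names, and "ML" and "DCRL" here refer to the paper's maximum-likelihood and dynamic-critical reinforcement learning phases.

```python
# Hyperparameters as reported in the paper's experiment setup.
# Key names are illustrative assumptions, not the authors' identifiers.
CONFIG = {
    "optimizer": "Adam",      # Kingma and Ba, 2014
    "lr_ml": 0.0008,          # initial learning rate for ML training
    "lr_dcrl": 0.0001,        # initial learning rate for DCRL training
    "batch_size": 48,
    "dropout": 0.3,           # Srivastava et al., 2014
    "char_embed_size": 50,    # also the size of the character-level LSTMs
    "hidden_size": 100,       # main hidden size
    "gamma": 3,               # hyperparameter γ
}


def step_lr(lr, dev_f1, best_f1):
    """Learning-rate schedule described in the paper: halve the rate on a
    'bad iteration', i.e. when the development-set F1 fails to improve.

    Returns the (possibly halved) learning rate and the updated best F1.
    """
    if dev_f1 > best_f1:
        return lr, dev_f1       # improvement: keep the current rate
    return lr / 2, best_f1      # bad iteration: halve the rate
```

For example, starting ML training at 0.0008, an epoch whose dev F1 does not beat the best so far halves the rate to 0.0004; training stops once dev F1 no longer improves, per the quoted stopping criterion.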