Machine Comprehension Using Match-LSTM and Answer Pointer

Authors: Shuohang Wang, Jing Jiang

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments show that both of our two models substantially outperform the best results obtained by Rajpurkar et al. (2016) using logistic regression and manually crafted features. Besides, our boundary model also achieves the best performance on the MSMARCO dataset (Nguyen et al., 2016). In this section, we present our experiment results and perform some analyses to better understand how our models work."
Researcher Affiliation | Academia | Shuohang Wang, School of Information Systems, Singapore Management University (shwang.2014@phdis.smu.edu.sg); Jing Jiang, School of Information Systems, Singapore Management University (jingjiang@smu.edu.sg)
Pseudocode | No | The paper describes the model architecture and mathematical formulations but does not include any explicitly labeled pseudocode or algorithm blocks. (A sketch of the match-LSTM attention step, reconstructed from the paper's equations, follows the table.)
Open Source Code | Yes | "Besides, we also made our code available online." [footnote 3: https://github.com/shuohangwang/SeqMatchSeq]
Open Datasets | Yes | "We use the Stanford Question Answering Dataset (SQuAD) v1.1 and the human-generated Microsoft MAchine Reading COmprehension (MSMARCO) dataset v1.1 to conduct our experiments."
Dataset Splits | Yes | For SQuAD: "The data has been split into a training set (with 87,599 question-answer pairs), a development set (with 10,570 question-answer pairs) and a hidden test set." For MSMARCO: "The data has been split into a training set (82,326 pairs), a development set (10,047 pairs) and a test set (9,650 pairs)." (A split-loading sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as CPU/GPU models, processors, or memory used for running the experiments.
Software Dependencies | No | The paper mentions using GloVe for word embeddings and ADAMAX as an optimizer, but does not provide version numbers for software dependencies such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | "The dimensionality l of the hidden layers is set to be 150. We use ADAMAX (Kingma & Ba, 2015) with the coefficients β1 = 0.9 and β2 = 0.999 to optimize the model. Each update is computed through a minibatch of 30 instances. We do not use L2-regularization." (A configuration sketch follows the table.)
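
Since the paper presents the match-LSTM layer only as equations, here is a minimal sketch of a single attention step, assuming PyTorch. The function name `match_attention`, the tensor shapes, and the parameter names are our own reading of the paper's formulas, not the authors' released code.

```python
import torch
import torch.nn.functional as F

# Hedged sketch of one match-LSTM attention step (our reconstruction, not the
# authors' code). Hq: question encodings, shape (Q, l); hp: current passage
# token encoding, shape (l,); hr_prev: previous match-LSTM state, shape (l,);
# Wq, Wp, Wr: (l, l) weight matrices; bp: (l,) bias; w: (l,) vector; b: scalar.
def match_attention(Hq, hp, hr_prev, Wq, Wp, Wr, bp, w, b):
    # G = tanh(Wq*Hq + (Wp*hp + Wr*hr_prev + bp)), broadcast over question tokens
    G = torch.tanh(Hq @ Wq.T + (Wp @ hp + Wr @ hr_prev + bp))  # (Q, l)
    # Attention weights over the question for the current passage token
    alpha = F.softmax(G @ w + b, dim=0)                        # (Q,)
    # The match-LSTM input: passage encoding concatenated with attended question
    return torch.cat([hp, alpha @ Hq], dim=0)                  # (2l,)
```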
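
As a convenience for reproduction (not something the paper uses), the public SQuAD v1.1 splits quoted in the Dataset Splits row can be fetched through the Hugging Face `datasets` library; the hidden test set is not publicly distributed.

```python
from datasets import load_dataset

# Hedged sketch: load the public SQuAD v1.1 splits. Only train and validation
# (development) are available; the official test set is hidden.
squad = load_dataset("squad")
print(len(squad["train"]))       # 87,599 question-answer pairs
print(len(squad["validation"]))  # 10,570 question-answer pairs
```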
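
The Experiment Setup row translates directly into an optimizer configuration. A minimal sketch assuming PyTorch, which the paper does not name; the LSTM module and its input width are hypothetical placeholders, not the full model.

```python
import torch

# Hedged sketch of the reported hyperparameters (framework choice is ours).
hidden_size = 150  # "The dimensionality l of the hidden layers is set to be 150."
model = torch.nn.LSTM(input_size=300, hidden_size=hidden_size)  # placeholder; 300 is hypothetical

optimizer = torch.optim.Adamax(
    model.parameters(),
    betas=(0.9, 0.999),  # ADAMAX coefficients beta_1 and beta_2 as reported
    weight_decay=0.0,    # "We do not use L2-regularization."
)
batch_size = 30          # "Each update is computed through a minibatch of 30 instances."
```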