Machine Comprehension Using Match-LSTM and Answer Pointer
Authors: Shuohang Wang, Jing Jiang
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that both of our two models substantially outperform the best results obtained by Rajpurkar et al. (2016) using logistic regression and manually crafted features. Besides, our boundary model also achieves the best performance on the MSMARCO dataset (Nguyen et al., 2016). In this section, we present our experiment results and perform some analyses to better understand how our models work. |
| Researcher Affiliation | Academia | Shuohang Wang, School of Information Systems, Singapore Management University (shwang.2014@phdis.smu.edu.sg); Jing Jiang, School of Information Systems, Singapore Management University (jingjiang@smu.edu.sg) |
| Pseudocode | No | The paper describes the model architecture and mathematical formulations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Besides, we also made our code available online. [Footnote 3: https://github.com/shuohangwang/SeqMatchSeq] |
| Open Datasets | Yes | We use the Stanford Question Answering Dataset (SQuAD) v1.1 and the human-generated Microsoft MAchine Reading COmprehension (MSMARCO) dataset v1.1 to conduct our experiments. |
| Dataset Splits | Yes | For SQuAD: The data has been split into a training set (with 87,599 question-answer pairs), a development set (with 10,570 question-answer pairs) and a hidden test set. For MSMARCO: The data has been split into a training set (82,326 pairs), a development set (10,047 pairs) and a test set (9,650 pairs). A loading sketch follows the table. |
| Hardware Specification | No | The paper does not provide specific hardware details such as CPU/GPU models, processors, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions using GloVe for word embeddings and ADAMAX as an optimizer, but does not provide specific version numbers for software dependencies like programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | The dimensionality l of the hidden layers is set to be 150. We use ADAMAX (Kingma & Ba, 2015) with the coefficients β1 = 0.9 and β2 = 0.999 to optimize the model. Each update is computed through a minibatch of 30 instances. We do not use L2-regularization. A configuration sketch follows the table. |
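As referenced in the Dataset Splits row, the public SQuAD v1.1 split sizes can be checked directly. The sketch below is a minimal example assuming the Hugging Face `datasets` library, which postdates the paper and is not part of its released code; the hidden SQuAD test set is not publicly distributed, so only the train and development splits are shown.

```python
# Minimal sketch verifying the public SQuAD v1.1 split sizes quoted in the
# Dataset Splits row. Assumes the Hugging Face `datasets` library (not part
# of the original paper); the hidden test set is not distributed.
from datasets import load_dataset

squad = load_dataset("squad")  # SQuAD v1.1
print(len(squad["train"]))       # expected: 87,599 question-answer pairs
print(len(squad["validation"]))  # expected: 10,570 (the development set)
```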
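The hyperparameters quoted in the Experiment Setup row map directly onto an optimizer configuration. The sketch below uses PyTorch as an assumption (the paper names no framework, and its released code is Torch/Lua); the model is a placeholder stand-in for the Match-LSTM + Answer Pointer architecture, and the learning rate, which the paper does not report, is left at the PyTorch default.

```python
# Minimal sketch of the training configuration from the Experiment Setup row.
# Hidden size, ADAMAX betas, batch size, and the absence of L2 regularization
# come from the paper; everything else here is an assumption.
import torch
import torch.nn as nn

HIDDEN_DIM = 150  # dimensionality l of the hidden layers (from the paper)
BATCH_SIZE = 30   # minibatch size (from the paper)

# Placeholder model; input_size=300 is an assumed GloVe embedding
# dimensionality (the table above mentions GloVe but not its dimension).
model = nn.LSTM(input_size=300, hidden_size=HIDDEN_DIM, batch_first=True)

# ADAMAX with beta1 = 0.9, beta2 = 0.999 and no L2 regularization
# (weight_decay=0), as stated in the paper; the learning rate is not
# reported, so the PyTorch default (2e-3) is kept as an assumption.
optimizer = torch.optim.Adamax(
    model.parameters(),
    betas=(0.9, 0.999),
    weight_decay=0.0,
)
```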