A Compare-Aggregate Model for Matching Text Sequences
Authors: Shuohang Wang, Jing Jiang
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate our model on four different datasets representing different tasks. The first three datasets are question answering tasks while the last one is on textual entailment. The statistics of the four datasets are shown in Table 2. We will first introduce the task settings and the way we customize the compare-aggregate structure to each task. Then we will show the baselines for the different datasets. Finally, we discuss the experiment results shown in Table 3 and the ablation study shown in Table 4. |
| Researcher Affiliation | Academia | Shuohang Wang School of Information Systems Singapore Management University shwang.2014@phdis.smu.edu.sg Jing Jiang School of Information Systems Singapore Management University jingjiang@smu.edu.sg |
| Pseudocode | No | The paper describes the model architecture and components using text and mathematical equations, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We have also made our code available online.1 Footnote 1: https://github.com/shuohangwang/SeqMatchSeq |
| Open Datasets | Yes | We present a model that follows this general framework and test it on four different datasets, namely, MovieQA, InsuranceQA, WikiQA and SNLI. ... For the machine comprehension task MovieQA (Tapaswi et al., 2016)... For the SNLI (Bowman et al., 2015) dataset... For the InsuranceQA (Feng et al., 2015) dataset... For the WikiQA (Yang et al., 2015) dataset... |
| Dataset Splits | Yes | The statistics of the four datasets are shown in Table 2. ... Table 2: The statistics of different datasets. Q:question/hypothesis, C:candidate answers for each question, A:answer/hypothesis, P:plot, w:word (average). (Includes columns for 'train', 'dev', and 'test' for each dataset). |
| Hardware Specification | No | The paper describes hyper-parameters and software dependencies but does not specify any details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using GloVe embeddings and ADAMAX optimizer with specific coefficients but does not provide specific version numbers for any programming languages, libraries, or frameworks (e.g., Python, TensorFlow, PyTorch). |
| Experiment Setup | Yes | The implementation details of the models are as follows. The word embeddings are initialized from GloVe (Pennington et al., 2014). During training, they are not updated. The word embeddings not found in GloVe are initialized with zero. The dimensionality l of the hidden layers is set to be 150. We use ADAMAX (Kingma & Ba, 2015) with the coefficients β1 = 0.9 and β2 = 0.999 to optimize the model. We do not use L2 regularization. The main parameter we tuned is the dropout on the embedding layer. For WikiQA, which is a relatively small dataset, we also tune the learning rate and the batch size. For the others, we set the batch size to be 30 and the learning rate to 0.002. |
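The reported optimizer settings (Adamax with β1 = 0.9, β2 = 0.999, learning rate 0.002, no L2 regularization) can be illustrated with a minimal pure-Python sketch of a single Adamax update step (Kingma & Ba, 2015). The paper does not state which framework was used, so the function below is an illustrative assumption, not the authors' implementation:

```python
# Hypothetical sketch of one Adamax update step, using the
# hyper-parameters reported in the paper (lr=0.002, beta1=0.9,
# beta2=0.999). Parameters are plain lists of floats.

def adamax_step(theta, grad, m, u, t,
                lr=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adamax update; t is the 1-based step counter.

    Returns updated (theta, m, u), where m is the biased first-moment
    estimate and u is the exponentially weighted infinity norm.
    """
    new_theta, new_m, new_u = [], [], []
    for th, g, mi, ui in zip(theta, grad, m, u):
        mi = beta1 * mi + (1 - beta1) * g      # first-moment estimate
        ui = max(beta2 * ui, abs(g))           # infinity-norm estimate
        step = (lr / (1 - beta1 ** t)) * mi / (ui + eps)
        new_theta.append(th - step)
        new_m.append(mi)
        new_u.append(ui)
    return new_theta, new_m, new_u

# One update on a single parameter with gradient 0.5:
theta, m, u = adamax_step([1.0], [0.5], [0.0], [0.0], t=1)
```

Unlike Adam, Adamax requires no bias correction of the second-moment term, which is one reason it is sometimes preferred for models with sparse gradient updates such as word embeddings (though here the embeddings are frozen).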