Ordered Memory
Authors: Yikang Shen, Shawn Tan, Arian Hosseini, Zhouhan Lin, Alessandro Sordoni, Aaron C. Courville
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that our model achieves strong performance on the logical inference task (Bowman et al., 2015) and the ListOps (Nangia and Bowman, 2018) task. We can also interpret the model to retrieve the induced tree structure, and find that these induced structures align with the ground truth. Finally, we evaluate our model on the Stanford Sentiment Treebank tasks (Socher et al., 2013), and find that it performs comparatively with the state-of-the-art methods in the literature. |
| Researcher Affiliation | Collaboration | Yikang Shen, Mila/Université de Montréal and Microsoft Research, Montréal, Canada; Shawn Tan, Mila/Université de Montréal, Montréal, Canada; Arian Hosseini, Mila/Université de Montréal and Microsoft Research, Montréal, Canada; Zhouhan Lin, Mila/Université de Montréal, Montréal, Canada; Alessandro Sordoni, Microsoft Research, Montréal, Canada; Aaron Courville, Mila/Université de Montréal, Montréal, Canada |
| Pseudocode | Yes | Algorithm 1: Ordered Memory algorithm. The attention function Att(·) is defined in section 3.1. The recursive cell function cell(·) is defined in section 3.2. (A hedged sketch of this update loop appears below the table.) |
| Open Source Code | Yes | The code can be found at https://github.com/yikangshen/Ordered-Memory |
| Open Datasets | Yes | We evaluate the tree learning capabilities of our model on two datasets: logical inference (Bowman et al., 2015) and ListOps (Nangia and Bowman, 2018). We also evaluate our model on the Stanford Sentiment Treebank tasks (Socher et al., 2013). |
| Dataset Splits | Yes | The model is trained on sequences containing up to 6 operations and tested on sequences with a higher number (7-12) of operations. Each partition includes a training set from which all data points matching the rule indicated in Excluded are filtered out, and a test set formed by the matched data points. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types, memory) used for the experiments. |
| Software Dependencies | No | The paper mentions modifying code from the 'Annotated Transformer' but does not specify software dependencies with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | Hyper-parameters can be found in Appendix B. We used the same hidden state size for our model and baselines for proper comparison. |
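
The Pseudocode row above points at Algorithm 1, which updates a stack of ordered memory slots once per token using an attention function Att(·) and a recursive cell function cell(·). The following is a minimal PyTorch sketch of the stick-breaking style of slot selection that this kind of ordered-memory update relies on; the `OrderedMemorySketch` class, the linear stand-ins for Att(·) and cell(·), and the soft-write rule are illustrative assumptions of ours, not the authors' implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn


def stick_breaking(scores: torch.Tensor) -> torch.Tensor:
    """p_i = sigmoid(s_i) * prod_{j<i}(1 - sigmoid(s_j)).

    Earlier slots claim probability mass first, which is what induces
    an ordering over memory slots."""
    g = torch.sigmoid(scores)                                   # per-slot gate in (0, 1)
    rest = torch.cumprod(1.0 - g, dim=-1)                       # mass left after each slot
    shifted = torch.cat([torch.ones_like(rest[..., :1]),
                         rest[..., :-1]], dim=-1)               # mass left *before* each slot
    return g * shifted


class OrderedMemorySketch(nn.Module):
    """One illustrative step: score each slot against the input token,
    pick a soft write position via stick-breaking, and blend a candidate
    value (a stand-in for the paper's recursive cell) into the memory."""

    def __init__(self, n_slots: int, d: int):
        super().__init__()
        self.n_slots = n_slots
        self.att = nn.Linear(2 * d, 1)                          # stand-in for Att(.)
        self.cell = nn.Sequential(nn.Linear(2 * d, d),          # stand-in for cell(.)
                                  nn.Tanh())

    def forward(self, memory: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # memory: (n_slots, d); x: (d,) embedding of the current token
        x_tiled = x.unsqueeze(0).expand(self.n_slots, -1)       # (n_slots, d)
        pair = torch.cat([memory, x_tiled], dim=-1)             # (n_slots, 2d)
        p = stick_breaking(self.att(pair).squeeze(-1))          # (n_slots,) write weights
        candidate = self.cell(pair)                             # (n_slots, d) new content
        # soft write: each slot interpolates between old content and candidate
        return (1.0 - p).unsqueeze(-1) * memory + p.unsqueeze(-1) * candidate


# usage: fold a toy sequence of 5 token embeddings into a 4-slot memory
model = OrderedMemorySketch(n_slots=4, d=8)
memory = torch.zeros(4, 8)
for x in torch.randn(5, 8):
    memory = model(memory, x)
print(memory.shape)  # torch.Size([4, 8])
```

The key design point the sketch tries to surface is the stick-breaking normalization: unlike a plain softmax, it biases writes toward lower slots until they are "used up", which is what lets the memory behave like a soft stack and recover tree structure.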