Ordered Memory

Authors: Yikang Shen, Shawn Tan, Arian Hosseini, Zhouhan Lin, Alessandro Sordoni, Aaron C. Courville

Venue: NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that our model achieves strong performance on the logical inference task (Bowman et al., 2015) and the ListOps (Nangia and Bowman, 2018) task. We can also interpret the model to retrieve the induced tree structure, and find that these induced structures align with the ground truth. Finally, we evaluate our model on the Stanford Sentiment Treebank tasks (Socher et al., 2013), and find that it performs comparatively with the state-of-the-art methods in the literature.
Researcher Affiliation | Collaboration | Yikang Shen (Mila/Université de Montréal and Microsoft Research, Montréal, Canada); Shawn Tan (Mila/Université de Montréal, Montréal, Canada); Arian Hosseini (Mila/Université de Montréal and Microsoft Research, Montréal, Canada); Zhouhan Lin (Mila/Université de Montréal, Montréal, Canada); Alessandro Sordoni (Microsoft Research, Montréal, Canada); Aaron Courville (Mila/Université de Montréal, Montréal, Canada)
Pseudocode | Yes | Algorithm 1: Ordered Memory algorithm. The attention function Att(·) is defined in Section 3.1. The recursive cell function cell(·) is defined in Section 3.2. (A simplified sketch of this per-time-step loop is given after the table.)
Open Source Code | Yes | The code can be found at https://github.com/yikangshen/Ordered-Memory
Open Datasets | Yes | We evaluate the tree learning capabilities of our model on two datasets: logical inference (Bowman et al., 2015) and ListOps (Nangia and Bowman, 2018). We also evaluate our model on the Stanford Sentiment Treebank tasks (Socher et al., 2013).
Dataset Splits | Yes | The model is trained on sequences containing up to 6 operations and tested on sequences with a higher number (7-12) of operations. Each partition includes a training set from which all data points matching the rule indicated in "Excluded" are filtered out, and a test set formed by the matched data points. (A hypothetical split sketch is given after the table.)
Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types, memory) used for the experiments.
Software Dependencies | No | The paper mentions modifying code from the 'Annotated Transformer' but does not specify software dependencies with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | Hyper-parameters can be found in Appendix B. We used the same hidden state size for our model and baselines for proper comparison.
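
To give a concrete picture of the Algorithm 1 referenced in the Pseudocode row, below is a minimal, illustrative sketch of an Ordered Memory-style time step. It is not the authors' implementation (see the linked repository for that): the helper names `attn` and `cell`, the plain-softmax attention, and the cumulative gating are simplifying assumptions meant only to convey the control flow of the algorithm, namely attend over memory slots, compose candidate slots bottom-up with a recursive cell, then gate between old and new memory.

```python
import torch
import torch.nn.functional as F

def ordered_memory_step(x_t, memory, candidate, attn, cell):
    """One illustrative time step over a memory of shape (num_slots, hidden_dim).

    x_t:       current token embedding, shape (hidden_dim,)
    memory:    previous memory M_{t-1}, shape (num_slots, hidden_dim)
    candidate: previous candidate memory, same shape
    attn:      callable (x_t, candidate) -> (num_slots,) attention logits
    cell:      recursive composition callable (child, slot) -> new slot
    """
    # Attention over memory slots (Att(.) in the paper; a plain softmax here).
    probs = F.softmax(attn(x_t, candidate), dim=0)            # (num_slots,)

    # Cumulative gates: attended (lower) slots are rewritten, higher slots
    # are preserved -- a simplification of the paper's gating scheme.
    keep = torch.cumsum(probs, dim=0).unsqueeze(1)            # (num_slots, 1)

    # Build new candidate slots bottom-up with the recursive cell(.).
    new_candidate = []
    child = x_t
    for i in range(memory.size(0)):
        child = cell(child, candidate[i])
        new_candidate.append(child)
    new_candidate = torch.stack(new_candidate)                # (num_slots, hidden_dim)

    # Gated blend of preserved memory and newly composed content.
    new_memory = keep * memory + (1.0 - keep) * new_candidate
    return new_memory, new_candidate
```

The actual attention and gating functions are richer (Sections 3.1-3.2 of the paper); this sketch only mirrors the overall loop structure of Algorithm 1.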
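
To make the Dataset Splits row concrete, here is a hypothetical sketch of the length-generalization split described there for ListOps: train on sequences with at most 6 operations, test on sequences with 7-12. The operator set and the bracketed prefix format (e.g. "[MAX 2 9 [MIN 4 7 ] 0 ]") follow Nangia and Bowman (2018); the field layout and helper names below are assumptions, not the authors' preprocessing code.

```python
# Operators used by ListOps; SM denotes sum modulo 10 (assumed token spelling).
OPERATORS = ("MAX", "MIN", "MED", "SM")

def count_operations(expression: str) -> int:
    """Count operator applications in a bracketed prefix ListOps expression."""
    return sum(expression.count(op) for op in OPERATORS)

def split_by_operation_count(examples):
    """Partition (label, expression) pairs: train <= 6 operations, test 7-12."""
    train, test = [], []
    for label, expression in examples:
        n_ops = count_operations(expression)
        if n_ops <= 6:
            train.append((label, expression))
        elif 7 <= n_ops <= 12:
            test.append((label, expression))
    return train, test
```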