Automatic Rule Extraction from Long Short Term Memory Networks

Authors: W. James Murdoch, Arthur Szlam

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We now present the results of our experiments.
Researcher Affiliation | Collaboration | W. James Murdoch, Department of Statistics, UC Berkeley, Berkeley, CA 94709, USA (jmurdoch@berkeley.edu); Arthur Szlam, Facebook AI Research, New York City, NY 10003 (aszlam@fb.com)
Pseudocode | No | The paper provides mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating public release of the source code for the described methodology.
Open Datasets | Yes | Originally introduced in Zhang et al. (2015), the Yelp review polarity dataset was obtained from the Yelp Dataset Challenge and has train and test sets of size 560,000 and 38,000. We also used the binary classification task from the Stanford Sentiment Treebank (SST; Socher et al., 2013), which has less data, with train/dev/test sizes of 6920/872/1821. WikiMovies is a dataset consisting of more than 100,000 questions about movies, paired with relevant Wikipedia articles. It was constructed using the pre-existing MovieLens dataset, paired with templates extracted from the SimpleQuestions dataset (Bordes et al., 2015).
Dataset Splits | Yes | We use the pre-defined splits into train, validation and test sets, containing 96k, 10k and 10k questions, respectively. ... train/dev/test sizes of 6920/872/1821
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instances) are provided for the experiments. The paper only mentions that models were implemented in Torch.
Software Dependencies | No | The paper mentions implementing models in "Torch" and optimizing with "Adam (Kingma & Ba, 2015)" but does not provide specific version numbers for these or any other software components.
Experiment Setup | Yes | The word and hidden representations of the LSTM were set to dimensions 200 and 200 for WikiMovies, 300 and 512 for Yelp, and 300 and 150 for Stanford Sentiment Treebank. All models were optimized using Adam (Kingma & Ba, 2015) with the default learning rate of 0.001, using early stopping on the validation set.
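The experiment-setup row above fully specifies the model sizes and optimizer but not the code. As a minimal sketch only: the paper implemented its models in (Lua) Torch, which is not reproduced here; the PyTorch fragment below merely illustrates the described SST configuration (word dimension 300, hidden dimension 150, Adam at learning rate 0.001). All class and variable names are hypothetical, and the vocabulary size is an arbitrary placeholder.

```python
# Hypothetical PyTorch analogue of the setup described in the paper,
# not the authors' original Torch implementation.
import torch
import torch.nn as nn


class LSTMClassifier(nn.Module):
    """Single-layer LSTM sentiment classifier (SST config from the paper)."""

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=150, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, tokens):
        emb = self.embed(tokens)          # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(emb)      # final hidden state of the LSTM
        return self.fc(h_n[-1])           # (batch, num_classes) logits

# vocab_size of 10,000 is a placeholder, not a value from the paper
model = LSTMClassifier(vocab_size=10_000)
# Adam with the default learning rate of 0.001, as stated in the paper
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Forward pass on a dummy batch of 4 sequences of length 12
logits = model(torch.randint(0, 10_000, (4, 12)))
print(tuple(logits.shape))  # (4, 2)
```

Early stopping on the validation set, also mentioned in the row above, would wrap the training loop and is omitted here for brevity.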