Learning Statistical Scripts with LSTM Recurrent Neural Networks

Authors: Karl Pichotta, Raymond Mooney

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our system on two tasks, inferring held-out events from text and inferring novel events from text, substantially outperforming prior approaches on both tasks.
Researcher Affiliation | Academia | Karl Pichotta and Raymond J. Mooney, {pichotta,mooney}@cs.utexas.edu, Department of Computer Science, The University of Texas at Austin, Austin, TX 78712, USA
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the methodology described, nor does it include a link to a code repository.
Open Datasets | Yes | For our corpus, we use English Language Wikipedia (footnote 4: http://en.wikipedia.org/, dump from Jan 2, 2014), breaking articles into paragraphs. Our training set was approximately 8.9 million event sequences, our validation set was approximately 89,000 event sequences, and our test set was 2,000 events from 411 sequences, such that no test-set article is in the training or validation set.
Dataset Splits | Yes | Our training set was approximately 8.9 million event sequences, our validation set was approximately 89,000 event sequences, and our test set was 2,000 events from 411 sequences, such that no test-set article is in the training or validation set. (An article-level split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running the experiments. It only mentions training duration.
Software Dependencies | Yes | We use version 3.3.1 of the Stanford Core NLP system. We use the implementation of LSTM provided by the Caffe library (Jia et al. 2014).
Experiment Setup | Yes | Since RNNs are quite sensitive to hyperparameter values (Sutskever et al. 2013), we measured validation set performance in different regions of hyperparameter space, ultimately selecting learning rate η = 0.1, momentum parameter μ = 0.98, LSTM vector length of 1,000, and a Normal N(0, 0.1) distribution for random initialization (biases are initialized to 0). Event component embeddings have dimension 300. We use ℓ2 regularization and Dropout (Hinton et al. 2012) with dropout probability 0.5. We clip gradient updates at 10 to prevent exploding gradients (Pascanu, Mikolov, and Bengio 2013). We damp η by 0.9 every 100,000 iterations. We train for 750,000 batch updates, which took between 50 and 60 hours. We use a beam width of 50 in all beam searches. (A training-configuration sketch follows the table.)
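The Dataset Splits row above reports only the resulting set sizes and the constraint that no test-set article appears in training or validation. The sketch below shows one way such an article-level split of Wikipedia paragraphs could be done; the function name, argument names, and article counts are illustrative assumptions, not the authors' code.

```python
import random

def split_by_article(articles, n_val_articles, n_test_articles, seed=0):
    """Article-level split: every paragraph (event sequence) from a given
    article lands in exactly one of train/validation/test, so no test-set
    article leaks into training or validation.

    `articles` is assumed to map article id -> list of paragraph event
    sequences; names and counts here are illustrative, not from the paper.
    """
    ids = sorted(articles)
    random.Random(seed).shuffle(ids)

    test_ids = set(ids[:n_test_articles])
    val_ids = set(ids[n_test_articles:n_test_articles + n_val_articles])

    train, val, test = [], [], []
    for art_id in ids:
        bucket = test if art_id in test_ids else val if art_id in val_ids else train
        bucket.extend(articles[art_id])
    return train, val, test
```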
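The Experiment Setup row lists the reported hyperparameters. Below is a hedged sketch of that configuration in PyTorch; the paper used Caffe's LSTM implementation, so the module and optimizer choices here (nn.LSTM, SGD with momentum, StepLR, clip_grad_norm_), the vocabulary size, and the ℓ2 coefficient are assumptions for illustration. The numeric values (η = 0.1, μ = 0.98, hidden size 1,000, embedding size 300, dropout 0.5, N(0, 0.1) initialization with zero biases, clipping at 10, decay of 0.9 every 100,000 iterations, 750,000 updates, beam width 50) are taken from the quoted passage.

```python
import torch
import torch.nn as nn

# Values quoted from the paper; vocab size and L2 weight are placeholders.
VOCAB_SIZE  = 50_000    # assumption: not reported in the excerpt above
EMBED_DIM   = 300       # event component embedding dimension
HIDDEN_DIM  = 1_000     # LSTM vector length
DROPOUT_P   = 0.5
LR          = 0.1       # learning rate eta
MOMENTUM    = 0.98
L2_WEIGHT   = 1e-4      # assumption: paper reports l2 regularization without a value
CLIP_AT     = 10.0      # gradient clipping threshold
LR_DECAY    = 0.9       # damp eta by 0.9 ...
DECAY_EVERY = 100_000   # ... every 100,000 iterations
MAX_UPDATES = 750_000
BEAM_WIDTH  = 50        # used at decoding time, not during training

class ScriptLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.lstm = nn.LSTM(EMBED_DIM, HIDDEN_DIM, batch_first=True)
        self.drop = nn.Dropout(DROPOUT_P)
        self.out = nn.Linear(HIDDEN_DIM, VOCAB_SIZE)
        # N(0, 0.1) weight initialization, biases at 0, as reported.
        for name, p in self.named_parameters():
            if "bias" in name:
                nn.init.zeros_(p)
            else:
                nn.init.normal_(p, mean=0.0, std=0.1)

    def forward(self, tokens):
        h, _ = self.lstm(self.drop(self.embed(tokens)))
        return self.out(self.drop(h))

model = ScriptLSTM()
optimizer = torch.optim.SGD(model.parameters(), lr=LR,
                            momentum=MOMENTUM, weight_decay=L2_WEIGHT)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                            step_size=DECAY_EVERY, gamma=LR_DECAY)

def training_step(batch_inputs, batch_targets):
    """One SGD update with gradient clipping and step-wise LR decay."""
    optimizer.zero_grad()
    logits = model(batch_inputs)                 # (batch, time, vocab)
    loss = nn.functional.cross_entropy(logits.transpose(1, 2), batch_targets)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), CLIP_AT)
    optimizer.step()
    scheduler.step()
    return loss.item()
```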