SummaRuNNer: A Recurrent Neural Network based Sequence Model for Extractive Summarization of Documents

Authors: Ramesh Nallapati, Feifei Zhai, Bowen Zhou

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present SummaRuNNer, a Recurrent Neural Network (RNN) based sequence model for extractive summarization of documents and show that it achieves performance better than or comparable to state-of-the-art. Our model has the additional advantage of being very interpretable, since it allows visualization of its predictions broken up by abstract features such as information content, salience and novelty. Another novel contribution of our work is abstractive training of our extractive model that can train on human generated reference summaries alone, eliminating the need for sentence-level extractive labels.
Researcher Affiliation | Industry | Ramesh Nallapati, Feifei Zhai, Bowen Zhou; nallapati@us.ibm.com, ffzhai2012@gmail.com, zhou@us.ibm.com; IBM Watson, 1011 Kitchawan Road, Yorktown Heights, NY 10598
Pseudocode | No | The paper provides mathematical equations and a graphical representation (Figure 1), but no explicit pseudocode block or algorithm listing.
Open Source Code | No | The paper references `https://github.com/deepmind/rc-data` for the CNN/Daily Mail corpus, which is a dataset. It does not provide a link or an explicit statement about releasing the source code for the SummaRuNNer model or the methodology described in the paper.
Open Datasets | Yes | For our experiments, we used the CNN/Daily Mail corpus (https://github.com/deepmind/rc-data) originally constructed by (Hermann et al. 2015) for the task of passage-based question answering, and re-purposed for the task of document summarization as proposed in (Cheng and Lapata 2016) for extractive summarization and (Nallapati et al. 2016) for abstractive summarization. We also used the DUC 2002 single-document summarization dataset (http://www-nlpir.nist.gov/projects/duc/guidelines/2002.html), consisting of 567 documents, as an additional out-of-domain test set to evaluate our models.
Dataset Splits | Yes | Overall, we have 196,557 training documents, 12,147 validation documents and 10,396 test documents from the Daily Mail corpus. If we also include the CNN subset, we have 286,722 training documents, 13,362 validation documents and 11,480 test documents.
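The CNN-only contribution to each split is not stated explicitly in the quoted passage, but it can be recovered by subtracting the Daily Mail figures from the combined ones. A quick sanity-check sketch (the dict names are illustrative, not from the paper):

```python
# Daily Mail-only split sizes quoted in the paper.
daily_mail = {"train": 196_557, "val": 12_147, "test": 10_396}
# Combined CNN + Daily Mail split sizes quoted in the paper.
combined = {"train": 286_722, "val": 13_362, "test": 11_480}

# Implied CNN-only contribution to each split.
cnn_only = {split: combined[split] - daily_mail[split] for split in daily_mail}
print(cnn_only)  # {'train': 90165, 'val': 1215, 'test': 1084}
```

This puts the CNN subset at roughly 90K training documents, consistent with the Daily Mail portion dominating the corpus.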
Hardware Specification | No | The paper notes that the corpus's 'large size makes it attractive for training deep neural networks such as ours, with several thousands of parameters', but it does not specify any particular hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | We used 100-dimensional word2vec (Mikolov et al. 2013) embeddings trained on the CNN/Daily Mail corpus as our embedding initialization. We used a batch size of 64 at training time, and adadelta (Zeiler 2012) to train our model. The paper mentions word2vec and adadelta, but does not provide specific version numbers for these or any other software dependencies crucial for replication.
Experiment Setup | Yes | We used 100-dimensional word2vec (Mikolov et al. 2013) embeddings trained on the CNN/Daily Mail corpus as our embedding initialization. We limited the vocabulary size to 150K and the maximum number of sentences per document to 100, and the maximum sentence length to 50 words, to speed up computation. We fixed the model hidden state size at 200. We used a batch size of 64 at training time, and adadelta (Zeiler 2012) to train our model. We employed gradient clipping to regularize our model and an early stopping criterion based on validation cost.
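The quoted setup can be collected into a single configuration sketch. The key names below are illustrative choices, not identifiers from the authors' (unreleased) code; only the values come from the paper:

```python
# Hyperparameters quoted in the paper's experiment-setup passage.
# Key names are illustrative; the authors' code is not available.
summarunner_config = {
    "embedding_dim": 100,            # word2vec (Mikolov et al. 2013), CNN/Daily Mail-trained
    "vocab_size": 150_000,           # vocabulary limited to 150K
    "max_sentences_per_doc": 100,    # truncation for speed
    "max_sentence_length": 50,       # in words
    "hidden_size": 200,              # model hidden state size
    "batch_size": 64,
    "optimizer": "adadelta",         # Zeiler 2012
    "gradient_clipping": True,       # used as regularization
    "early_stopping_on": "validation cost",
}
```

Note that clipping thresholds, adadelta settings, and the early-stopping patience are not reported, so a replication would have to choose those values independently.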