Optimizing Sentence Modeling and Selection for Document Summarization

Authors: Wenpeng Yin, Yulong Pei

IJCAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on DUC2002 and DUC2004 benchmark data sets demonstrate the effectiveness of our approach.
Researcher Affiliation | Academia | The Center for Information and Language Processing, University of Munich (wenpeng@cis.uni-muenchen.de); School of Computer Science, Carnegie Mellon University (yulongp@cs.cmu.edu)
Pseudocode | Yes | Algorithm 1: Diversified Selection Algorithm
Open Source Code | No | The paper links to external tools such as ROUGE (footnote 4) and word2vec (footnote 5), but it does not provide access to the source code for its own proposed method (DivSelect+CNNLM).
Open Datasets | Yes | We conduct experiments on the data sets DUC2002 and DUC2004, in which generic multi-document summarization has been one of the fundamental tasks (i.e., task 2 in DUC2002 and task 2 in DUC2004). DUC2002: http://www-nlpir.nist.gov/projects/duc/data/2002_data.html; DUC2004: http://www-nlpir.nist.gov/projects/duc/data/2004_data.html
Dataset Splits | No | The paper uses DUC2002 and DUC2004 benchmark datasets for evaluation, but it does not specify explicit training, validation, and test splits with percentages or sample counts for these datasets.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | Yes | We use the officially adopted ROUGE [Lin, 2004] (version 1.5.5) toolkit for evaluation (an illustrative scoring sketch follows this table).
Experiment Setup | Yes | CNNLM setup: DUC data is relatively small for training a neural network, so in experiments CNNLM is first pre-trained on one million sentences from English Gigaword [Robert, 2009] and then further trained on DUC data to learn a representation for each sentence in DUC. As in prior work, pre-trained word representations from [Mikolov et al., 2013] are used to initialize the input layer of Figure 1 and are fine-tuned during training. The filter width is l = 5, five left-context words are used in the average layer, 10 noise words are sampled in NCE for each true example, and all words, phrases, and sentences have 300-dimensional representations (a configuration sketch follows this table).
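Configuration sketch. The Experiment Setup row reports the CNNLM hyperparameters, but no implementation is released. As a rough illustration only, the following PyTorch sketch wires the reported values (300-dimensional representations, filter width 5, five left-context words, 10 NCE noise samples) into a generic convolutional sentence encoder; the class name CNNSentenceEncoder, the vocabulary size, the max-over-time pooling, and the tanh nonlinearity are assumptions, not details confirmed by the paper.

    # Hypothetical sketch, not the authors' code: a generic CNN sentence encoder
    # using the hyperparameters reported in the Experiment Setup row.
    import torch
    import torch.nn as nn

    EMB_DIM = 300            # words, phrases, and sentences are 300-dimensional
    FILTER_WIDTH = 5         # filter width l = 5
    LEFT_CONTEXT_WORDS = 5   # five left-context words averaged in the average layer
    NCE_NOISE_SAMPLES = 10   # 10 noise words sampled per true example in NCE

    class CNNSentenceEncoder(nn.Module):
        """Convolution over word embeddings followed by max-over-time pooling
        (an assumed pooling choice) to produce one vector per sentence."""

        def __init__(self, vocab_size: int, emb_dim: int = EMB_DIM,
                     filter_width: int = FILTER_WIDTH):
            super().__init__()
            # Input layer; the paper initializes it with word2vec vectors and
            # fine-tunes it during training.
            self.embedding = nn.Embedding(vocab_size, emb_dim)
            self.conv = nn.Conv1d(emb_dim, emb_dim, kernel_size=filter_width,
                                  padding=filter_width - 1)

        def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
            # token_ids: (batch, seq_len) -> sentence vectors: (batch, emb_dim)
            emb = self.embedding(token_ids).transpose(1, 2)  # (batch, emb_dim, seq_len)
            features = torch.tanh(self.conv(emb))
            return features.max(dim=2).values                # max-over-time pooling

    if __name__ == "__main__":
        encoder = CNNSentenceEncoder(vocab_size=50_000)      # vocabulary size is assumed
        dummy_batch = torch.randint(0, 50_000, (4, 20))      # 4 sentences of 20 token ids
        print(encoder(dummy_batch).shape)                    # torch.Size([4, 300])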
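Scoring sketch. The Software Dependencies row states that the official ROUGE 1.5.5 Perl toolkit is used; that toolkit is not reproduced here. Purely to illustrate the evaluation step, the sketch below computes ROUGE-1, ROUGE-2, and ROUGE-L with the rouge-score Python package (pip install rouge-score), which approximates but does not exactly replicate ROUGE 1.5.5; the package choice and the placeholder texts are assumptions.

    # Hypothetical evaluation sketch using the rouge-score package, not the
    # official ROUGE 1.5.5 Perl toolkit cited in the paper.
    from rouge_score import rouge_scorer

    def rouge_report(reference: str, summary: str) -> dict:
        """Return precision/recall/F1 for ROUGE-1, ROUGE-2, and ROUGE-L."""
        scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                          use_stemmer=True)
        return scorer.score(reference, summary)

    if __name__ == "__main__":
        reference = "the cat sat on the mat"      # placeholder reference summary
        summary = "a cat was sitting on the mat"  # placeholder system summary
        for metric, score in rouge_report(reference, summary).items():
            print(f"{metric}: P={score.precision:.3f} "
                  f"R={score.recall:.3f} F1={score.fmeasure:.3f}")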