Ranking with Recursive Neural Networks and Its Application to Multi-Document Summarization

Authors: Ziqiang Cao, Furu Wei, Li Dong, Sujian Li, Ming Zhou

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the DUC 2001, 2002 and 2004 multi-document summarization datasets show that R2N2 outperforms state-of-the-art extractive summarization approaches.
Researcher Affiliation | Collaboration | Ziqiang Cao (1), Furu Wei (2), Li Dong (3), Sujian Li (1), Ming Zhou (2); (1) Key Laboratory of Computational Linguistics, Peking University, MOE, China; (2) Microsoft Research, Beijing, China; (3) Beihang University, Beijing, China
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement about releasing its own source code, nor does it include a link to a code repository for the described methodology.
Open Datasets | Yes | Here we focus on the generic multi-document summarization task, which was carried out in DUC 2001, 2002 and 2004. The documents are all from the news domain and are grouped into various thematic clusters. Table 2 shows the size of the three datasets and the maximum length of summaries for each task. Footnotes also provide URLs for DUC (http://duc.nist.gov/) and TAC (http://www.nist.gov/tac/).
Dataset Splits | No | The paper states, 'We train the model on two years data and test it on the other year.' This describes a train/test setup but does not specify a separate validation split or the percentages used for hyperparameter tuning.
Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., GPU or CPU models, memory) used to run the experiments.
Software Dependencies | Yes | We use the Stanford Core NLP (Manning et al. 2014) to parse and convert a sentence into a binary tree... We use the IBM CPLEX Optimizer in this paper... We use LIBLINEAR... ROUGE (Lin 2004). It has grown up to be a standard automatic evaluation metric for DUC since 2004. The parameter of length constraint is -l 100 for DUC 2001/2002, and -b 665 for DUC 2004. We take ROUGE-2 recall as the main metric for comparison due to its high capability of evaluating automatic summarization systems (Owczarzak et al. 2012). ROUGE-1.5.5 with options: -n 2 -m -u -c 95 -x -r 1000 -f A -p 0.5 -t 0.
Experiment Setup | Yes | We set the dimension of RNN (kh) to 8, which is half of the input layer dimension. We use mini-batch gradient descent with L2-norm regularization to update weights. The learning rate is 0.005 and the regularization factor is 0.1. We set the batch size to 100. The training process with 100 iterations takes about 30 minutes.
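The optimization settings quoted above (mini-batch gradient descent with L2-norm regularization, learning rate 0.005, regularization factor 0.1, batch size 100, 100 iterations) are enough to sketch the training loop generically. The sketch below applies those hyperparameters to a toy linear least-squares model standing in for the recursive neural network; the synthetic data and the squared-error loss are assumptions for illustration, not the paper's actual R2N2 model or features.

```python
import numpy as np

# Hyperparameters as reported in the paper's experiment setup.
K_H = 8              # RNN hidden dimension k_h (half the 16-dim input layer)
LEARNING_RATE = 0.005
L2_FACTOR = 0.1
BATCH_SIZE = 100
ITERATIONS = 100

rng = np.random.default_rng(0)

# Toy regression data standing in for sentence features and saliency scores
# (hypothetical stand-in; the real model is a recursive neural network).
X = rng.normal(size=(1000, 2 * K_H))
true_w = rng.normal(size=2 * K_H)
y = X @ true_w + 0.1 * rng.normal(size=1000)

w = np.zeros(2 * K_H)
for _ in range(ITERATIONS):
    # One iteration = one shuffled sweep over the data in mini-batches of 100.
    order = rng.permutation(len(X))
    for start in range(0, len(X), BATCH_SIZE):
        idx = order[start:start + BATCH_SIZE]
        xb, yb = X[idx], y[idx]
        grad = xb.T @ (xb @ w - yb) / len(idx)  # squared-error gradient
        grad += L2_FACTOR * w                   # L2-norm regularization term
        w -= LEARNING_RATE * grad               # gradient-descent update

mse = float(np.mean((X @ w - y) ** 2))
```

With these settings the update contracts toward the ridge-regularized solution, which is why the regularization factor of 0.1 leaves a small residual bias relative to the unregularized fit.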