Ranking with Recursive Neural Networks and Its Application to Multi-Document Summarization

Authors: Ziqiang Cao, Furu Wei, Li Dong, Sujian Li, Ming Zhou

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the DUC 2001, 2002 and 2004 multi-document summarization datasets show that R2N2 outperforms state-of-the-art extractive summarization approaches.
Researcher Affiliation | Collaboration | Ziqiang Cao (1), Furu Wei (2), Li Dong (3), Sujian Li (1), Ming Zhou (2); (1) Key Laboratory of Computational Linguistics, Peking University, MOE, China; (2) Microsoft Research, Beijing, China; (3) Beihang University, Beijing, China
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement about releasing its own source code, nor does it include a link to a code repository for the described methodology.
Open Datasets | Yes | Here we focus on the generic multi-document summarization task, which was carried out in DUC 2001, 2002 and 2004. The documents are all from the news domain and are grouped into various thematic clusters. Table 2 shows the size of the three datasets and the maximum length of summaries for each task. Footnotes also provide URLs for DUC (http://duc.nist.gov/) and TAC (http://www.nist.gov/tac/).
Dataset Splits | No | The paper states, 'We train the model on two years data and test it on the other year.' This describes a train/test setup but does not specify a separate validation split or the percentages used for hyperparameter tuning.
Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., GPU or CPU models, memory) used to run the experiments.
Software Dependencies | Yes | We use the Stanford Core NLP (Manning et al. 2014) to parse and convert a sentence into a binary tree... We use the IBM CPLEX Optimizer in this paper... We use LIBLINEAR... ROUGE (Lin 2004). It has grown up to be a standard automatic evaluation metric for DUC since 2004. The parameter of length constraint is -l 100 for DUC 2001/2002, and -b 665 for DUC 2004. We take ROUGE-2 recall as the main metric for comparison due to its high capability of evaluating automatic summarization systems (Owczarzak et al. 2012). ROUGE-1.5.5 with options: -n 2 -m -u -c 95 -x -r 1000 -f A -p 0.5 -t 0.
Experiment Setup | Yes | We set the dimension of RNN (kh) to 8, which is half of the input layer dimension. We use mini-batch gradient descent with L2-norm regularization to update weights. The learning rate is 0.005 and the regularization factor is 0.1. We set the batch size to 100. The training process with 100 iterations takes about 30 minutes.
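The optimization settings quoted above (mini-batch gradient descent with L2-norm regularization, learning rate 0.005, regularization factor 0.1, batch size 100, 100 iterations) are enough to sketch the training loop generically. The sketch below applies those hyperparameters to a toy linear least-squares model standing in for the recursive neural network; the synthetic data and the squared-error loss are assumptions for illustration, not the paper's actual R2N2 model or features.

```python
import numpy as np

# Hyperparameters as reported in the paper's experiment setup.
K_H = 8              # RNN hidden dimension k_h (half the 16-dim input layer)
LEARNING_RATE = 0.005
L2_FACTOR = 0.1
BATCH_SIZE = 100
ITERATIONS = 100

rng = np.random.default_rng(0)

# Toy regression data standing in for sentence features and saliency scores
# (hypothetical stand-in; the real model is a recursive neural network).
X = rng.normal(size=(1000, 2 * K_H))
true_w = rng.normal(size=2 * K_H)
y = X @ true_w + 0.1 * rng.normal(size=1000)

w = np.zeros(2 * K_H)
for _ in range(ITERATIONS):
    # One iteration = one shuffled sweep over the data in mini-batches of 100.
    order = rng.permutation(len(X))
    for start in range(0, len(X), BATCH_SIZE):
        idx = order[start:start + BATCH_SIZE]
        xb, yb = X[idx], y[idx]
        grad = xb.T @ (xb @ w - yb) / len(idx)  # squared-error gradient
        grad += L2_FACTOR * w                   # L2-norm regularization term
        w -= LEARNING_RATE * grad               # gradient-descent update

mse = float(np.mean((X @ w - y) ** 2))
```

With these settings the update contracts toward the ridge-regularized solution, which is why the regularization factor of 0.1 leaves a small residual bias relative to the unregularized fit.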