Ranking with Recursive Neural Networks and Its Application to Multi-Document Summarization
Authors: Ziqiang Cao, Furu Wei, Li Dong, Sujian Li, Ming Zhou
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the DUC 2001, 2002 and 2004 multi-document summarization datasets show that R2N2 outperforms state-of-the-art extractive summarization approaches. |
| Researcher Affiliation | Collaboration | Ziqiang Cao¹, Furu Wei², Li Dong³, Sujian Li¹, Ming Zhou². ¹Key Laboratory of Computational Linguistics, Peking University, MOE, China; ²Microsoft Research, Beijing, China; ³Beihang University, Beijing, China |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing its own source code, nor does it include a link to a code repository for the methodology described. |
| Open Datasets | Yes | Here we focus on the generic multi-document summarization task, which was carried out in DUC 2001, 2002 and 2004. The documents are all from the news domain and are grouped into various thematic clusters. Table 2 shows the size of the three datasets and the maximum length of summaries for each task. Footnotes also provide URLs for DUC (http://duc.nist.gov/) and TAC (http://www.nist.gov/tac/). |
| Dataset Splits | No | The paper states, 'We train the model on two years data and test it on the other year.' This describes a train/test setup but does not specify a separate validation dataset split or percentages used for hyperparameter tuning. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU or CPU models, memory) used to run the experiments. |
| Software Dependencies | Yes | We use the Stanford Core NLP (Manning et al. 2014) to parse and convert a sentence into a binary tree... We use IBM CPLEX Optimizer in this paper... We use LIBLINEAR... ROUGE (Lin 2004). It has grown up to be a standard automatic evaluation metric for DUC since 2004. The parameter of length constraint is -l 100 for DUC 2001/2002, and -b 665 for DUC 2004. We take ROUGE-2 recall as the main metric for comparison due to its high capability of evaluating automatic summarization systems (Owczarzak et al. 2012). ROUGE-1.5.5 with options: -n 2 -m -u -c 95 -x -r 1000 -f A -p 0.5 -t 0. (A hedged invocation sketch using these options appears below the table.) |
| Experiment Setup | Yes | We set the dimension of RNN (kh) to 8, which is a half of the input layer dimension. We use mini-batch gradient descent with L2-norm regularization to update weights. The learning rate is 0.005 and regularization factor is 0.1. We set the batch size to 100. The training process with 100 iterations spends about 30 minutes. |
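The ROUGE options quoted in the Software Dependencies row map directly onto a command-line invocation of ROUGE-1.5.5. The Python sketch below shows one way the reported evaluation could be reproduced; the script path, data directory, settings-file name, and the `-e`/`-a` flags are assumptions of this sketch, not details given in the paper.

```python
import subprocess

# Hypothetical paths -- the paper does not say where ROUGE-1.5.5 is installed.
ROUGE_SCRIPT = "ROUGE-1.5.5/ROUGE-1.5.5.pl"
ROUGE_DATA = "ROUGE-1.5.5/data"       # WordNet databases shipped with ROUGE
SETTINGS_XML = "rouge_settings.xml"   # maps peer summaries to reference summaries

def run_rouge(length_flag):
    """Run ROUGE-1.5.5 with the options reported in the paper.

    length_flag: ["-l", "100"] for DUC 2001/2002 (100-word limit) or
                 ["-b", "665"] for DUC 2004 (665-byte limit).
    """
    cmd = [
        "perl", ROUGE_SCRIPT,
        "-e", ROUGE_DATA,   # assumption: point ROUGE at its data directory
        # Options quoted in the paper:
        "-n", "2",          # compute ROUGE-1 and ROUGE-2
        "-m",               # apply Porter stemming
        "-u",               # include unigrams in skip-bigram (ROUGE-SU)
        "-c", "95",         # 95% confidence interval
        "-x",               # skip ROUGE-L
        "-r", "1000",       # 1000 bootstrap resampling points
        "-f", "A",          # score against the average of all reference models
        "-p", "0.5",        # equal weight on precision and recall in F-measure
        "-t", "0",          # use the sentence as the counting unit
        *length_flag,
        "-a", SETTINGS_XML, # assumption: evaluate all systems in the settings file
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

if __name__ == "__main__":
    print(run_rouge(["-l", "100"]))   # DUC 2001/2002 length constraint
```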
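Likewise, the Experiment Setup row describes plain mini-batch gradient descent with L2-norm regularization. The sketch below spells out that update rule with the reported hyperparameters (learning rate 0.005, regularization factor 0.1, batch size 100, 100 iterations). The gradient function is a placeholder: in R2N2 the gradients come from backpropagation through the recursive network over parse trees, which a summary table cannot reproduce.

```python
import numpy as np

# Hyperparameters as reported in the paper.
LEARNING_RATE = 0.005   # learning rate
L2_FACTOR = 0.1         # L2-norm regularization factor
BATCH_SIZE = 100
N_ITERATIONS = 100

def minibatch_gd(grad_fn, W, X, y, rng=None):
    """Mini-batch gradient descent with L2-norm regularization.

    grad_fn(W, X_batch, y_batch) must return dLoss/dW for the batch; in
    R2N2 this gradient would come from backpropagation through the
    recursive network over parse trees, which is not reproduced here.
    """
    rng = rng or np.random.default_rng(0)
    n = X.shape[0]
    for _ in range(N_ITERATIONS):
        perm = rng.permutation(n)
        for start in range(0, n, BATCH_SIZE):
            idx = perm[start:start + BATCH_SIZE]
            grad = grad_fn(W, X[idx], y[idx])
            # L2-regularized update: W <- W - eta * (grad + lambda * W)
            W = W - LEARNING_RATE * (grad + L2_FACTOR * W)
    return W

if __name__ == "__main__":
    # Toy usage with a least-squares objective as a stand-in. The
    # 16-dimensional features echo the input-layer dimension implied by
    # the paper's kh = 8 setting (half the input layer); everything else
    # here is invented for illustration.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(1000, 16))
    y = X @ rng.normal(size=16)
    lsq_grad = lambda W, Xb, yb: 2.0 * Xb.T @ (Xb @ W - yb) / len(yb)
    print(minibatch_gd(lsq_grad, np.zeros(16), X, y))
```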