Improving Multi-Document Summarization via Text Classification
Authors: Ziqiang Cao, Wenjie Li, Sujian Li, Furu Wei
AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on DUC generic multi-document summarization datasets show that TCSum can achieve state-of-the-art performance without using any hand-crafted features and has the capability to catch the variations of summary styles with respect to different text categories. |
| Researcher Affiliation | Collaboration | (1) Department of Computing, The Hong Kong Polytechnic University, Hong Kong; (2) Hong Kong Polytechnic University Shenzhen Research Institute, China; (3) Key Laboratory of Computational Linguistics, Peking University, MOE, China; (4) Microsoft Research, Beijing, China |
| Pseudocode | No | The paper describes the model architecture and processes using text and a diagram (Figure 1), but it does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | The most commonly used evaluation corpora for summarization are the ones published by the Document Understanding Conferences (DUC) and Text Analytics Conferences (TAC). ... The NYT corpus contains over 1.8 million articles published and annotated by the New York Times. ... https://catalog.ldc.upenn.edu/LDC2008T19 |
| Dataset Splits | Yes | We conduct three-fold validation. The model is trained on two years' data and tested on the remaining year's. ... The cross validation shows that the learned classification model of TCSum achieves over 85% accuracy on this dataset. (A fold-assignment sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | Yes | For evaluation, we use ROUGE (Lin 2004), which has been regarded as a standard automatic evaluation metric since 2004. ROUGE-1.5.5 with options: -n 2 -m -u -c 95 -x -r 1000 -f A -p 0.5 -t 0. ... we apply the diagonal variant of AdaGrad with mini-batches (Duchi, Hazan, and Singer 2011) to update model parameters. (The full ROUGE invocation is sketched after the table.) |
| Experiment Setup | Yes | The dimension of word embeddings is set to 50, as in many previous papers (e.g., (Collobert et al. 2011)). We also set the dimension of sentence and document embeddings equivalent to the dimension of word embeddings, and the window size h to 2, to be consistent with ROUGE-2 evaluation. We empirically set the margin threshold of pairwise ranking Ω = 0.1. The initial learning rate is 0.1 and the batch size is 128. (These values are wired into the training sketch after the table.) |
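
The year-based three-fold validation quoted in the Dataset Splits row can be made concrete with a short sketch. This is a minimal illustration, not the authors' code: the year labels and the `three_fold_splits` helper are placeholders introduced here.

```python
# Minimal sketch of the paper's three-fold validation: train on two
# years of DUC data, test on the remaining year. The year labels are
# placeholders, not taken from the paper.
years = ["DUC-year-A", "DUC-year-B", "DUC-year-C"]

def three_fold_splits(years):
    """Yield (train_years, test_year): two years for training, one held out."""
    for test_year in years:
        train_years = [y for y in years if y != test_year]
        yield train_years, test_year

for train_years, test_year in three_fold_splits(years):
    print(f"train: {train_years}  test: {test_year}")
```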
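
The Software Dependencies row quotes the exact ROUGE-1.5.5 option string; a wrapper along the following lines would reproduce that invocation. The script, data, and config paths are hypothetical placeholders, while the option flags are copied verbatim from the paper.

```python
import subprocess

# Hypothetical paths: point these at a local ROUGE-1.5.5 installation.
ROUGE_SCRIPT = "ROUGE-1.5.5/ROUGE-1.5.5.pl"
ROUGE_DATA = "ROUGE-1.5.5/data"   # ROUGE's bundled resources (-e)
CONFIG_XML = "rouge_config.xml"   # lists peer/model summary pairs (-a)

# Options exactly as reported: ROUGE-1/2 (-n 2), stemming (-m), skip-bigrams
# with unigrams (-u), 95% confidence intervals (-c 95), 1000 resamples (-r 1000).
options = "-n 2 -m -u -c 95 -x -r 1000 -f A -p 0.5 -t 0".split()

result = subprocess.run(
    ["perl", ROUGE_SCRIPT, "-e", ROUGE_DATA, *options, "-a", CONFIG_XML],
    capture_output=True, text=True, check=True,
)
print(result.stdout)
```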
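
The Experiment Setup row pins down the reported hyperparameters; the sketch below wires them into a generic margin-based pairwise ranking update. PyTorch and the linear scorer are assumptions made here for illustration (neither reproduces TCSum itself), but the margin Ω = 0.1, learning rate 0.1, batch size 128, and 50-dimensional embeddings are as reported.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted from the paper's experiment setup.
EMBED_DIM = 50       # word / sentence / document embedding dimension
MARGIN = 0.1         # pairwise ranking margin Ω
LEARNING_RATE = 0.1
BATCH_SIZE = 128

# Stand-in scorer: TCSum's actual sentence scorer is not reproduced here.
scorer = nn.Linear(EMBED_DIM, 1)

# Diagonal AdaGrad with mini-batches (Duchi, Hazan, and Singer 2011).
optimizer = torch.optim.Adagrad(scorer.parameters(), lr=LEARNING_RATE)

# Hinge-style pairwise ranking loss with margin Ω: positive sentences
# should outscore negative ones by at least the margin.
ranking_loss = nn.MarginRankingLoss(margin=MARGIN)

# One illustrative update on random "sentence embeddings".
pos = torch.randn(BATCH_SIZE, EMBED_DIM)
neg = torch.randn(BATCH_SIZE, EMBED_DIM)
target = torch.ones(BATCH_SIZE)  # +1 means pos should rank higher

loss = ranking_loss(scorer(pos).squeeze(-1), scorer(neg).squeeze(-1), target)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```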