MultiSumm: Towards a Unified Model for Multi-Lingual Abstractive Summarization
Authors: Yue Cao, Xiaojun Wan, Jin-ge Yao, Dian Yu
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on summarization datasets for five rich-resource languages: English, Chinese, French, Spanish, and German, as well as two low-resource languages: Bosnian and Croatian. Experimental results show that our proposed model significantly outperforms a multi-lingual baseline model. |
| Researcher Affiliation | Collaboration | Yue Cao (1,2,3), Xiaojun Wan (1,2,3), Jin-ge Yao (1), Dian Yu (4); 1 Wangxuan Institute of Computer Technology, Peking University; 2 Center for Data Science, Peking University; 3 The MOE Key Laboratory of Computational Linguistics, Peking University; 4 Tencent AI Lab; {yuecao, wanxiaojun, yaojinge}@pku.edu.cn, yudian@tencent.com |
| Pseudocode | Yes | Algorithm 1 Multi-Lingual Training Algorithm for Abstractive Text Summarization *(training-loop sketch below)* |
| Open Source Code | Yes | https://github.com/ycao1996/Multi-Lingual-Summarization |
| Open Datasets | Yes | We use the Europarl-v5 dataset (Koehn 2005) for English, German, Spanish, and French. ... We use the News-Commentary-v13 dataset (Tiedemann 2012) for Chinese... We use the SETIMES dataset (Tiedemann 2012) for Bosnian and Croatian... We use the Gigaword dataset for English, French, and Spanish summarization (Graff et al. 2003; Mendonça, Graff, and DiPersio 2009a; 2009b). ... We use the LCSTS dataset (Hu, Chen, and Zhu 2015) for Chinese summarization. ... We use the SWISS dataset for German summarization. ... As there is no existing summarization dataset for the low-resource languages Bosnian and Croatian, we first build a new summarization dataset for the two languages. |
| Dataset Splits | Yes | We use the officially divided training sets, validation sets, and test sets. ... we use part I as the training set, part II as the validation set, and samples with scores of 3, 4, and 5 in part III as the test set. The numbers of training, validation, and test pairs are 2,400,591, 10,666, and 725, respectively. ... We randomly split 80% of the samples as the training set, 10% of the samples as the validation set, and 10% of the samples as the test set. *(split sketch below)* |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Fairseq toolkit' and 'subword-nmt toolkit' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | For transformer architectures, the model hidden size, feed-forward hidden size, the number of layers, and the number of heads are 512, 2,048, 6, and 8, respectively. ... the batch size is set to 4,000 for multi-lingual models and 1,000 for individual models. ... We use warm-up learning rate (Goyal et al. 2017) for the first 4,000 steps, and the initial warm-up learning rate is set to 1e-7. We use the dropout technique and set the dropout rate to 0.2. We use beam search for inference, and the beam size is set to 5 according to the results on the validation set. *(configuration sketch below)* |
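The paper's Algorithm 1 (referenced in the Pseudocode row above) is not quoted in full here. As a rough illustration of the round-robin multi-task schedule such an algorithm describes, the following is a minimal sketch; the `train_multilingual` helper, the task tags, and the model interface are hypothetical placeholders, not the authors' code.

```python
import itertools

def train_multilingual(model, optimizer, task_loaders, num_steps):
    """Hypothetical round-robin schedule over per-language tasks.

    task_loaders maps a task tag (e.g. 'sum-en', 'sum-zh', 'mt-en-de')
    to an iterable yielding (src_batch, tgt_batch) pairs. The paper's
    actual schedule is specified by its Algorithm 1.
    """
    iters = {tag: itertools.cycle(dl) for tag, dl in task_loaders.items()}
    for _ in range(num_steps):
        for tag, batches in iters.items():
            src, tgt = next(batches)
            loss = model(src, tgt, task=tag)  # assumed to return a scalar seq2seq loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```

Cycling through every task once per outer step keeps the shared encoder-decoder from drifting toward any single language, which is the usual motivation for an interleaved multi-lingual schedule.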
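The 80/10/10 random split reported for the newly built Bosnian/Croatian data (Dataset Splits row) is easy to restate as code. A minimal sketch, assuming a flat list of samples; the helper name and fixed seed are my own:

```python
import random

def split_80_10_10(samples, seed=0):
    """Shuffle, then split into 80% train / 10% validation / 10% test."""
    rng = random.Random(seed)
    shuffled = list(samples)                    # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    n = len(shuffled)
    return (shuffled[: int(0.8 * n)],           # training set
            shuffled[int(0.8 * n): int(0.9 * n)],  # validation set
            shuffled[int(0.9 * n):])            # test set
```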
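The Experiment Setup row pins down the architecture well enough to restate it as a configuration. Below is a minimal PyTorch sketch using the reported sizes; the paper trained with the Fairseq toolkit, so this only illustrates the stated hyperparameters rather than reproducing the authors' setup, and the peak learning rate is an assumption (the paper reports only the 1e-7 warm-up start and the 4,000-step ramp).

```python
import torch
import torch.nn as nn

# Reported architecture: hidden size 512, FFN size 2,048, 6 layers, 8 heads.
model = nn.Transformer(
    d_model=512,
    nhead=8,
    num_encoder_layers=6,
    num_decoder_layers=6,
    dim_feedforward=2048,
    dropout=0.2,                    # reported dropout rate
)

PEAK_LR = 1e-3                      # assumed; not stated in the paper
WARMUP_INIT_LR = 1e-7               # reported initial warm-up learning rate
WARMUP_STEPS = 4000                 # reported warm-up length

optimizer = torch.optim.Adam(model.parameters(), lr=PEAK_LR)

def lr_lambda(step):
    # Linear ramp from WARMUP_INIT_LR to PEAK_LR over the first 4,000 steps,
    # in the spirit of the gradual warm-up of Goyal et al. (2017); flat after.
    if step >= WARMUP_STEPS:
        return 1.0
    frac = step / WARMUP_STEPS
    return (WARMUP_INIT_LR + frac * (PEAK_LR - WARMUP_INIT_LR)) / PEAK_LR

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)  # call scheduler.step() once per update
```

At inference time the paper decodes with beam search at beam size 5, chosen on the validation set.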