A Reinforced Topic-Aware Convolutional Sequence-to-Sequence Model for Abstractive Text Summarization

Authors: Li Wang, Junlin Yao, Yunzhe Tao, Li Zhong, Wei Liu, Qiang Du

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We carry out the experimental evaluation with state-of-the-art methods over the Gigaword, DUC-2004, and LCSTS datasets. The empirical results demonstrate the superiority of our proposed method in the abstractive summarization.
Researcher Affiliation | Collaboration | (1) Tencent Data Center of SNG, (2) ETH Zürich, (3) Columbia University, (4) Tencent AI Lab
Pseudocode | No | The paper describes the architecture and various components of the model with mathematical formulas but does not provide any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any statement about making its source code publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | First, we consider the annotated Gigaword corpus [Graff and Cieri, 2003] preprocessed identically to [Rush et al., 2015], which leads to around 3.8M training samples, 190K validation samples and 1951 test samples for evaluation. We also evaluate various models on the DUC-2004 test set [Over et al., 2007]. The last dataset for evaluation is a large corpus of Chinese short text summarization (LCSTS) dataset [Hu et al., 2015] collected and constructed from the Chinese microblogging website Sina Weibo. Following the setting in the original paper, we use the first part of LCSTS dataset for training, which contains 2.4M text-summary pairs, and choose 725 pairs from the last part with high annotation scores as our test set.
Dataset Splits | Yes | First, we consider the annotated Gigaword corpus [Graff and Cieri, 2003] preprocessed identically to [Rush et al., 2015], which leads to around 3.8M training samples, 190K validation samples and 1951 test samples for evaluation.
Hardware Specification | Yes | All models are implemented in PyTorch [Paszke et al., 2017] and trained on a single Tesla M40 GPU.
Software Dependencies | No | The paper mentions 'PyTorch [Paszke et al., 2017]' as the implementation framework but does not provide a specific version number for PyTorch or other software dependencies.
Experiment Setup | Yes | We employ six convolutional layers for both the encoder and decoder. All embeddings...have a dimensionality of 256. We use a learning rate of 0.25 and reduce it by a decay rate of 0.1...the scaling factor λ is set to be 0.99...Nesterov’s accelerated gradient method [Sutskever et al., 2013] is used for training, with the mini-batch size of 32 and the learning rate of 0.0001.
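
The dataset split sizes quoted in the Open Datasets and Dataset Splits rows can be recorded as constants for a re-implementation. The sketch below is illustrative only: the dictionary names and the sanity-check helper are assumptions and do not come from the paper.

```python
# Split sizes reported in the paper: Gigaword preprocessed as in Rush et al. (2015);
# LCSTS Part I used for training and 725 high-annotation-score pairs for testing.
# The constant names and the helper below are illustrative assumptions.
GIGAWORD_SPLITS = {"train": 3_800_000, "valid": 190_000, "test": 1_951}
LCSTS_SPLITS = {"train": 2_400_000, "test": 725}

def check_split_sizes(expected, loaded, tolerance=0.01):
    """Warn when a loaded split deviates from the size reported in the paper."""
    for name, ref in expected.items():
        if name in loaded and abs(loaded[name] - ref) > tolerance * ref:
            print(f"warning: {name} has {loaded[name]} pairs, paper reports ~{ref}")
```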
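
The hyperparameters in the Experiment Setup row can likewise be collected into a minimal PyTorch training-configuration sketch. Assumptions beyond what the paper states: the momentum value (the paper reports none), the reading that the 0.0001 learning rate belongs to the later training stage using Nesterov's accelerated gradient, and the exact form in which the scaling factor λ combines the maximum-likelihood and reinforcement-learning losses.

```python
import torch

# Values quoted in the paper's experiment setup; the grouping into one config
# dict and the two-stage interpretation are assumptions for illustration.
CONFIG = {
    "enc_layers": 6,     # six convolutional layers in the encoder
    "dec_layers": 6,     # six convolutional layers in the decoder
    "embed_dim": 256,    # all embeddings have dimensionality 256
    "lr": 0.25,          # initial learning rate
    "lr_decay": 0.1,     # decay rate applied to the learning rate
    "lambda_rl": 0.99,   # scaling factor lambda reported as 0.99
    "batch_size": 32,    # mini-batch size
    "nag_lr": 1e-4,      # learning rate reported for Nesterov's accelerated gradient
    "momentum": 0.9,     # NOT given in the paper; placeholder assumption
}

def make_optimizer(model, lr=CONFIG["nag_lr"], momentum=CONFIG["momentum"]):
    """Nesterov's accelerated gradient via SGD with nesterov=True in PyTorch."""
    return torch.optim.SGD(model.parameters(), lr=lr,
                           momentum=momentum, nesterov=True)

def mixed_loss(loss_ml, loss_rl, lam=CONFIG["lambda_rl"]):
    """Combine ML and RL losses with scaling factor lambda.
    The paper only states lambda = 0.99; this weighted-sum form is an assumption."""
    return lam * loss_rl + (1.0 - lam) * loss_ml
```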