Neural Abstractive Summarization with Structural Attention

Authors: Tanya Chowdhury, Sachin Kumar, Tanmoy Chakraborty

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare the performance of the models on the basis of ROUGE-1, 2 and L F1-scores [Lin, 2004] on the CNN/Dailymail (Table 2), CQA and Multinews (Table 3) datasets.
Researcher Affiliation | Academia | Tanya Chowdhury (IIIT-Delhi, India), Sachin Kumar (Carnegie Mellon University, USA), Tanmoy Chakraborty (IIIT-Delhi, India)
Pseudocode | No | The paper describes the model architecture and computations in prose and mathematical equations but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code is public at https://bit.ly/35i7q93.
Open Datasets | Yes | (i) the CNN/Dailymail dataset [Hermann et al., 2015; Nallapati et al., 2016]; (ii) the CQA dataset [Chowdhury and Chakraborty, 2019]; additionally, analysis is included on Multi-News [Fabbri et al., 2019].
Dataset Splits | Yes | The scripts released by [Nallapati et al., 2016] are used to extract approximately 250k training pairs, 13k validation pairs and 11.5k test pairs from the corpora. ... We split the 100k dataset into 80k training instances, 10k validation and 10k test instances.
Hardware Specification | Yes | GPU: GeForce 2080 Ti
Software Dependencies | No | The paper mentions an Adagrad optimizer but does not provide version numbers for the software or libraries used.
Experiment Setup | Yes | Vocabulary size = 50,000; input embedding dim = 128; training decoder steps = 100; learning rate = 0.15; optimizer = Adagrad; Adagrad init accumulator = 0.1; max gradient norm (for clipping) = 2.0; max decoding steps (beam-search decoding) = 120; min decoding steps (beam-search decoding) = 35; beam search width = 4; weight of coverage loss = 1
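
The Research Type row quotes the paper's evaluation protocol: ROUGE-1, ROUGE-2 and ROUGE-L F1-scores [Lin, 2004]. As a minimal sketch of that metric computation, the snippet below uses the open-source `rouge-score` package; the package choice and the reference/candidate strings are assumptions, since the paper does not name the ROUGE implementation it relied on.

```python
# Minimal sketch: ROUGE-1/2/L F1 between a reference summary and a model
# output, using the `rouge-score` package (an assumed implementation choice).
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

reference = "police kill the gunman after a siege at the bank"    # hypothetical gold summary
candidate = "the gunman was killed by police after a bank siege"  # hypothetical model output

scores = scorer.score(reference, candidate)
for name, score in scores.items():
    print(f"{name}: F1 = {score.fmeasure:.4f}")
```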
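
The Dataset Splits row reports an 80k/10k/10k partition of the 100k CQA instances. The sketch below shows one way such a split could be produced; the placeholder data, the shuffle, and the seed are all assumptions, as the paper does not describe the splitting procedure beyond the split sizes.

```python
# Minimal sketch of an 80k/10k/10k split (placeholder data; assumed seed).
import random

instances = [(f"question {i}", f"summary {i}") for i in range(100_000)]  # placeholder pairs

random.seed(0)            # assumed seed for a deterministic split
random.shuffle(instances)

train = instances[:80_000]
valid = instances[80_000:90_000]
test = instances[90_000:]

print(len(train), len(valid), len(test))  # 80000 10000 10000
```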
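
The Experiment Setup row lists the hyperparameters as a flat table. The PyTorch sketch below shows where the optimizer-related settings would typically plug in; it is not the authors' released implementation (that is linked in the Open Source Code row), the framework choice is an assumption, and the embedding module and dummy loss are placeholders.

```python
# A minimal PyTorch sketch of the hyperparameters in the Experiment Setup row.
# The "model" and the dummy loss are hypothetical placeholders used only to
# show where each setting plugs in.
import torch
import torch.nn as nn

config = {
    "vocab_size": 50_000,          # Vocabulary size
    "emb_dim": 128,                # Input embedding dim
    "max_dec_steps_train": 100,    # Training decoder steps
    "lr": 0.15,                    # Learning rate
    "adagrad_init_acc": 0.1,       # Adagrad init accumulator
    "max_grad_norm": 2.0,          # Max gradient norm (for clipping)
    "beam_size": 4,                # Beam search width
    "min_dec_steps": 35,           # Min decoding steps (beam-search decoding)
    "max_dec_steps": 120,          # Max decoding steps (beam-search decoding)
    "cov_loss_weight": 1.0,        # Weight of coverage loss
}

model = nn.Embedding(config["vocab_size"], config["emb_dim"])  # placeholder module

optimizer = torch.optim.Adagrad(
    model.parameters(),
    lr=config["lr"],
    initial_accumulator_value=config["adagrad_init_acc"],
)

# One illustrative training step with gradient clipping before the update.
loss = model(torch.tensor([[1, 2, 3]])).sum()  # dummy loss
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), config["max_grad_norm"])
optimizer.step()
optimizer.zero_grad()
```

The beam-search settings (width 4, minimum 35 and maximum 120 decoding steps) and the coverage-loss weight apply to decoding and to the loss function rather than to the optimizer, so they are carried in the config dictionary only.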