Keywords-Guided Abstractive Sentence Summarization

Authors: Haoran Li, Junnan Zhu, Jiajun Zhang, Chengqing Zong, Xiaodong He (pp. 8196–8203)

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that multi-task learning and keywords-oriented guidance facilitate sentence summarization task, achieving better performance than the competitive models on the English Gigaword sentence summarization dataset.
Researcher Affiliation | Collaboration | (1) JD AI Research; (2) National Laboratory of Pattern Recognition, Institute of Automation, CAS; (3) University of Chinese Academy of Sciences; (4) CAS Center for Excellence in Brain Science and Intelligence Technology
Pseudocode | No | The paper describes the model architecture and steps in narrative text and formulas, but does not provide an explicitly labeled pseudocode or algorithm block.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We conduct experiments on the English Gigaword dataset, which has about 3.8 million training sentence-summary pairs. We use 8,000 pairs as the validation set and 2,000 pairs as the test set, provided by Zhou et al. (2017).
Dataset Splits | Yes | We use 8,000 pairs as the validation set and 2,000 pairs as the test set, provided by Zhou et al. (2017).
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) used for running experiments are provided in the paper.
Software Dependencies | No | The paper mentions using an 'Adam optimizer' and 'dropout' but does not specify software dependencies with version numbers (e.g., Python version, specific library versions like PyTorch or TensorFlow).
Experiment Setup | Yes | We set the size of word embedding and LSTM hidden state to 300 and 512, respectively. Adam optimizer is applied with the learning rate of 0.0005, momentum parameters β1 = 0.9 and β2 = 0.999, and ϵ = 10⁻⁸. We use dropout (Srivastava et al. 2014) with probability of 0.2 and gradient clipping (Pascanu, Mikolov, and Bengio 2013) with range [−1, 1]. The mini-batch size is set to 64.
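To make the quoted experiment setup concrete, the following is a minimal sketch of that training configuration assuming a PyTorch implementation. The paper does not name its framework, and all names below (Seq2SeqSketch, training_step, the vocabulary size of 50,000) are illustrative assumptions, not taken from the authors' code.

```python
# Minimal sketch of the reported training configuration, assuming PyTorch.
# The paper does not state its framework; every name here is illustrative.
import torch
import torch.nn as nn

EMBED_SIZE = 300    # word embedding size reported in the paper
HIDDEN_SIZE = 512   # LSTM hidden state size reported in the paper
DROPOUT_P = 0.2     # dropout probability (Srivastava et al. 2014)
BATCH_SIZE = 64     # mini-batch size reported in the paper
CLIP_VALUE = 1.0    # gradient clipping range [-1, 1] (Pascanu et al. 2013)

class Seq2SeqSketch(nn.Module):
    """Illustrative encoder-decoder skeleton using the reported layer sizes."""
    def __init__(self, vocab_size: int):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, EMBED_SIZE)
        self.encoder = nn.LSTM(EMBED_SIZE, HIDDEN_SIZE, batch_first=True)
        self.decoder = nn.LSTM(EMBED_SIZE, HIDDEN_SIZE, batch_first=True)
        self.dropout = nn.Dropout(DROPOUT_P)
        self.out = nn.Linear(HIDDEN_SIZE, vocab_size)

model = Seq2SeqSketch(vocab_size=50_000)  # vocabulary size is an assumption

# Adam with the reported learning rate, momentum parameters, and epsilon.
optimizer = torch.optim.Adam(
    model.parameters(), lr=5e-4, betas=(0.9, 0.999), eps=1e-8
)

def training_step(loss: torch.Tensor) -> None:
    """One optimization step with element-wise gradient clipping to [-1, 1]."""
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_value_(model.parameters(), CLIP_VALUE)
    optimizer.step()
```

The paper's phrase "gradient clipping ... with range [−1, 1]" reads as element-wise value clipping rather than norm clipping, which is why this sketch uses clip_grad_value_; if the authors instead clipped the gradient norm, clip_grad_norm_ would be the corresponding call.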