SemSUM: Semantic Dependency Guided Neural Abstractive Summarization
Authors: Hanqi Jin, Tianming Wang, Xiaojun Wan (pp. 8026–8033)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our model on the English Gigaword, DUC 2004 and MSR abstractive sentence summarization datasets. Experiments show that the proposed model improves semantic relevance and reduces content deviation, and also brings significant improvements on automatic evaluation ROUGE metrics. |
| Researcher Affiliation | Academia | Hanqi Jin¹²³, Tianming Wang¹³, Xiaojun Wan¹²³ — ¹Wangxuan Institute of Computer Technology, Peking University; ²Center for Data Science, Peking University; ³The MOE Key Laboratory of Computational Linguistics, Peking University. {jinhanqi, wangtm, wanxiaojun}@pku.edu.cn |
| Pseudocode | No | The paper presents architectural diagrams and mathematical formulations but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is publicly available at https://github.com/zhongxia96/SemSUM. |
| Open Datasets | Yes | We experiment with the English Gigaword dataset (Napoles, Gormley, and Durme 2012), the DUC 2004 dataset (Over, Dang, and Harman 2007) and the MSR-ATC Test Set (Toutanova et al. 2016). The Gigaword dataset contains about 3.8M sentence-summary pairs for training and 189K pairs for development. For test, we use the standard test set of 1951 sentence-summary pairs. ... (Footnote: all the training, validation and test datasets can be downloaded at https://github.com/harvardnlp/sent-summary.) |
| Dataset Splits | Yes | The Gigaword dataset contains about 3.8M sentence-summary pairs for training and 189K pairs for development. For test, we use the standard test set of 1951 sentence-summary pairs. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as the GPU or CPU models used for running the experiments; nothing beyond generic mentions of computation is given. |
| Software Dependencies | No | The paper mentions using the fairseq toolkit but does not provide specific version numbers for any software dependencies like programming languages, libraries, or frameworks (e.g., Python version, PyTorch version). |
| Experiment Setup | Yes | We set our model parameters based on preliminary experiments on the development set. We prune the vocabulary to 50k and use the word in the source sentence with maximum weight in copy attention to replace the unknown word to solve the OOV problem. We set the dimension of word embeddings and hidden units dmodel to 512, feed-forward units to 2048. We set 4 heads for multi-head graph-attention and 8 heads for multi-head self-attention, masked multi-head self-attention and multi-head cross-attention. We set the number of layers of sentence encoder L1, graph encoder L2, and summary decoder L3 to 4, 3 and 6, respectively. We set dropout rate to 0.1 and use the Adam optimizer with an initial learning rate α = 0.0001, momentum β1 = 0.9, β2 = 0.999 and weight decay ϵ = 10⁻⁵. The learning rate is halved if the valid loss on the development set increases for two consecutive epochs. We use a mini-batch size of 300. Beam search with beam size of 5 is used for decoding. (See the configuration sketch below the table.) |
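For readers checking the Experiment Setup row, here is a minimal sketch that expresses the quoted hyperparameters in plain PyTorch. It is an illustration only, not the authors' fairseq implementation: the `config` keys and the stand-in parameter list are hypothetical, the mapping of the paper's ϵ = 10⁻⁵ "weight decay" onto Adam's `weight_decay` argument is an assumption, and `ReduceLROnPlateau` only approximates the described rule of halving the learning rate after two consecutive epochs of rising development loss.

```python
import torch

# Hyperparameters as quoted in the Experiment Setup row (key names are ours).
config = dict(
    vocab_size=50_000,            # vocabulary pruned to 50k
    d_model=512,                  # word embedding / hidden size
    ffn_dim=2048,                 # feed-forward units
    graph_attn_heads=4,           # multi-head graph-attention
    other_attn_heads=8,           # self-, masked self-, and cross-attention
    sentence_encoder_layers=4,    # L1
    graph_encoder_layers=3,       # L2
    decoder_layers=6,             # L3
    dropout=0.1,
    batch_size=300,
    beam_size=5,
)

# Stand-in parameter list; in practice the parameters would come from the
# SemSUM model defined in the authors' released fairseq-based code.
params = [torch.nn.Parameter(torch.zeros(config["d_model"]))]

# Adam with the quoted learning rate and momenta; weight_decay=1e-5 is our
# reading of the paper's "weight decay ϵ = 10⁻⁵".
optimizer = torch.optim.Adam(
    params, lr=1e-4, betas=(0.9, 0.999), weight_decay=1e-5
)

# Halve the learning rate when development loss stops improving for two
# consecutive epochs (an approximation of the schedule described above).
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=2
)
```

In a real training loop one would call `scheduler.step(dev_loss)` once per epoch after validation; decoding with beam size 5 is handled by fairseq's generation utilities in the released code rather than by anything shown here.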