Hierarchical Heterogeneous Graph Attention Network for Syntax-Aware Summarization

Authors: Zixing Song, Irwin King (pp. 11340-11348)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that our model is effective for both the abstractive and extractive summarization tasks on five benchmark datasets from various domains.
Researcher Affiliation | Academia | Zixing Song, Irwin King, The Chinese University of Hong Kong, {zxsong, king}@cse.cuhk.edu.hk
Pseudocode | No | The paper does not include a figure, block, or section labeled "Pseudocode" or "Algorithm", nor structured steps formatted like pseudocode.
Open Source Code | No | The paper provides a link to a constituency parser (https://github.com/KhalilMrini/LAL-Parser), which is a tool used in their work, but there is no explicit statement or link indicating that the source code for their proposed model (SynapSum) is available.
Open Datasets | Yes | We choose five datasets to evaluate our model. The data split is described in Table 2. CNN/DM (Hermann et al. 2015; See, Liu, and Manning 2017)... New York Times (NYT) (Sandhaus 2008)... Reddit (Kim, Kim, and Kim 2019)... WikiHow (Koupaee and Wang 2018)... PubMed dataset... Table 2 (Dataset Split, CNN/DM row): Train 287K, Valid 13K, Test 11K, Avg. Doc. Len. 766.1, Avg. Sum. Len. 58.2, #Ext 3
Dataset Splits | Yes | We choose five datasets to evaluate our model. The data split is described in Table 2. CNN/DM (Hermann et al. 2015; See, Liu, and Manning 2017)... New York Times (NYT) (Sandhaus 2008)... Reddit (Kim, Kim, and Kim 2019)... WikiHow (Koupaee and Wang 2018)... PubMed dataset... Table 2 (Dataset Split, CNN/DM row): Train 287K, Valid 13K, Test 11K, Avg. Doc. Len. 766.1, Avg. Sum. Len. 58.2, #Ext 3
Hardware Specification | No | The paper does not specify the hardware used to run the experiments, such as specific GPU or CPU models.
Software Dependencies | No | The paper mentions using "Adam optimizer" and a "state-of-the-art constituency parser", but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | We choose the Adam optimizer with an initial learning rate 0.0001, momentum values β1 = 0.9, β2 = 0.999 and weight decay ϵ = 10⁻⁵. We feed the graph into our model in a mini-batch fashion with a size of 256. In addition, during the decoding step, a beam search strategy is utilized with the beam size of 3.
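
For reference, the quoted training configuration maps onto a short optimizer setup. The sketch below assumes PyTorch, which the paper does not name, and uses a placeholder linear layer in place of the authors' (unreleased) graph attention model; only the hyperparameter values are taken from the quote above.

import torch

# Placeholder model: the actual network is the paper's hierarchical
# heterogeneous graph attention model, which is not publicly released.
model = torch.nn.Linear(256, 256)

# Adam with the hyperparameters quoted in the Experiment Setup row.
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-4,             # initial learning rate 0.0001
    betas=(0.9, 0.999),  # momentum values beta1 = 0.9, beta2 = 0.999
    weight_decay=1e-5,   # weight decay of 10^-5
)

BATCH_SIZE = 256  # graphs are fed to the model in mini-batches of size 256
BEAM_SIZE = 3     # beam width used by the beam search decoding strategy

Note that the paper labels the 10⁻⁵ value as weight decay, so it is mapped to Adam's weight_decay argument here rather than to the numerical-stability eps term.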