Preserve Context Information for Extract-Generate Long-Input Summarization Framework

Authors: Ruifeng Yuan, Zili Wang, Ziqiang Cao, Wenjie Li

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach on both long-document and long-dialogue summarization datasets: arXiv and QMSum. The experiment results show that CAEG achieves the state-of-the-art result on QMSum and outperforms other extract-generate based models in arXiv.
Researcher Affiliation | Collaboration | (1) The Hong Kong Polytechnic University; (2) Xiaohongshu Inc; (3) Institute of Artificial Intelligence, Soochow University, China
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper mentions that 'The implementation of our code is based on transformers from Hugging Face,' but it does not provide any link or explicit statement that their own code for CAEG is open-source or available.
Open Datasets | Yes | arXiv (Cohan et al. 2018) is a dataset for long-input single-document summarization... QMSum (Zhong et al. 2021) is a benchmark for query-focused dialogue summarization.
Dataset Splits | No | The paper states 'We conduct the validation for every 100 steps and train the model for a maximum of 20000 steps,' but it does not provide specific details on the dataset splits (e.g., percentages or counts) for training, validation, and testing. (A split-inspection sketch for the public arXiv release appears after this table.)
Hardware Specification | Yes | The experiments are run on a single V100 GPU.
Software Dependencies | No | The paper states 'The implementation of our code is based on transformers from Hugging Face,' but it does not provide specific version numbers for this or any other software component.
Experiment Setup | Yes | Adam algorithm is used for optimization and the learning rate is set to 2e-6. The batch size for the training is set to 1, and gradient accumulation steps are set to 8. We conduct the validation for every 100 steps and train the model for a maximum of 20000 steps.
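Because both corpora are publicly released, the split information missing from the paper (Dataset Splits row) can be recovered from the datasets themselves. Below is a minimal sketch, assuming the "scientific_papers" dataset on the Hugging Face Hub (config "arxiv") is a mirror of the Cohan et al. (2018) arXiv release; QMSum is distributed separately through its own GitHub release and is not covered here.

```python
# Sketch: inspect the official splits of the public arXiv summarization dataset.
# Assumption: the Hub dataset "scientific_papers" with config "arxiv" mirrors
# the Cohan et al. (2018) release used in the paper.
from datasets import load_dataset

# Newer `datasets` releases may additionally require trust_remote_code=True
# for this script-based dataset.
arxiv = load_dataset("scientific_papers", "arxiv")

for split_name, split in arxiv.items():
    print(f"{split_name}: {len(split)} examples")
```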
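The Experiment Setup row maps naturally onto Hugging Face Trainer configuration, consistent with the paper's statement that the implementation is based on transformers. The sketch below is not the authors' code: only the numeric hyperparameters come from the paper, while the output directory and evaluation-strategy naming are assumptions, and Trainer defaults to AdamW rather than plain Adam.

```python
# Sketch: the reported hyperparameters expressed as Hugging Face TrainingArguments.
# Only the numeric values are taken from the paper; everything else is an assumption.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="caeg-checkpoints",   # hypothetical path, not from the paper
    learning_rate=2e-6,              # "the learning rate is set to 2e-6"
    per_device_train_batch_size=1,   # "The batch size for the training is set to 1"
    gradient_accumulation_steps=8,   # "gradient accumulation steps are set to 8"
    eval_strategy="steps",           # named evaluation_strategy in older transformers releases
    eval_steps=100,                  # "validation for every 100 steps"
    max_steps=20_000,                # "a maximum of 20000 steps"
    # Note: Trainer's default optimizer is AdamW; the paper states Adam.
)
```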