Preserve Context Information for Extract-Generate Long-Input Summarization Framework
Authors: Ruifeng Yuan, Zili Wang, Ziqiang Cao, Wenjie Li
AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on both long-document and long-dialogue summarization datasets: arXiv and QMSum. The experiment results show that CAEG achieves the state-of-the-art result on QMSum and outperforms other extract-generate based models on arXiv. |
| Researcher Affiliation | Collaboration | The Hong Kong Polytechnic University; Xiaohongshu Inc.; Institute of Artificial Intelligence, Soochow University, China |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper mentions that 'The implementation of our code is based on transformers from Hugging Face,' but it does not provide a link or an explicit statement that the authors' own code for CAEG is open-source or otherwise available. |
| Open Datasets | Yes | arXiv (Cohan et al. 2018) is a dataset for long-input single-document summarization... QMSum (Zhong et al. 2021) is a benchmark for query-focused dialogue summarization. |
| Dataset Splits | No | The paper states 'We conduct the validation for every 100 steps and train the model for a maximum of 20000 steps,' but it does not provide specific details on the dataset splits (e.g., percentages or counts) for training, validation, and testing. |
| Hardware Specification | Yes | The experiments are run on a single V100 GPU. |
| Software Dependencies | No | The paper states 'The implementation of our code is based on transformers from Hugging Face,' but it does not provide specific version numbers for this or any other software component. |
| Experiment Setup | Yes | Adam algorithm is used for optimization and the learning rate is set to 2e-6. The batch size for the training is set to 1, and gradient accumulation steps are set to 8. We conduct the validation for every 100 steps and train the model for a maximum of 20000 steps. |
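
Both datasets named in the "Open Datasets" row are publicly available. The sketch below shows one common way to load the arXiv summarization corpus (Cohan et al. 2018) through the Hugging Face `datasets` library; the `scientific_papers` hub dataset is an assumption about where to obtain the data, not something the paper specifies. QMSum is distributed through its authors' GitHub repository and typically has to be downloaded and preprocessed separately.

```python
# Hedged sketch: obtaining the arXiv long-document summarization corpus.
# The "scientific_papers" dataset name and its field layout are assumptions
# about a public copy of the data, not taken from the paper itself.
from datasets import load_dataset

# Config "arxiv" provides train / validation / test splits with
# "article" (source document) and "abstract" (reference summary) fields.
arxiv = load_dataset("scientific_papers", "arxiv")

example = arxiv["train"][0]
print(example["article"][:200])   # beginning of the source document
print(example["abstract"][:200])  # beginning of the reference summary
```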
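The "Experiment Setup" row reports hyperparameters but no training script. Below is a minimal sketch of how that schedule could be wired up in plain PyTorch; the model, data loaders, and validation hook are placeholders (the paper only says the implementation builds on Hugging Face transformers), and only the numeric values (Adam at 2e-6, batch size 1, 8 gradient-accumulation steps, validation every 100 steps, at most 20000 steps) come from the quoted text.

```python
# Hedged sketch of the reported optimization schedule; not the authors' code.
import torch


def train(model, train_loader, val_loader, validate_fn):
    """Run the reported setup: Adam @ lr 2e-6, per-device batch size 1 with
    8-step gradient accumulation, validation every 100 optimizer steps,
    stopping after at most 20000 optimizer steps.

    `model`, `train_loader`, `val_loader`, and `validate_fn` are
    user-supplied placeholders (e.g. a seq2seq model whose forward pass
    returns an object with a `.loss` attribute, as in transformers).
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=2e-6)
    accum_steps, eval_every, max_steps = 8, 100, 20_000

    step = 0
    model.train()
    while step < max_steps:
        for i, batch in enumerate(train_loader):
            loss = model(**batch).loss / accum_steps  # scale loss for accumulation
            loss.backward()
            if (i + 1) % accum_steps == 0:
                optimizer.step()
                optimizer.zero_grad()
                step += 1
                if step % eval_every == 0:
                    validate_fn(model, val_loader)    # checkpoint/early-stop logic not shown
                if step >= max_steps:
                    return
```

With batch size 1 and 8 accumulation steps, each optimizer update corresponds to an effective batch of 8 examples, which keeps memory within reach of the single V100 GPU reported in the "Hardware Specification" row.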