Data Augmentation for Abstractive Query-Focused Multi-Document Summarization

Authors: Ramakanth Pasunuru, Asli Celikyilmaz, Michel Galley, Chenyan Xiong, Yizhe Zhang, Mohit Bansal, Jianfeng Gao (pp. 13666–13674)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results demonstrate that our data augmentation and encoding methods outperform baseline models on automatic metrics, as well as on human evaluations along multiple attributes. We present empirical results of our proposed models on various datasets.
Researcher Affiliation | Collaboration | UNC Chapel Hill; Microsoft Research, Redmond. {ram, mbansal}@cs.unc.edu, {aslicel, mgalley, Chenyan.Xiong, yizhe.zhang, jfgao}@microsoft.com
Pseudocode | No | The paper does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | Code: https://github.com/ramakanth-pasunuru/QmdsCnnIr
Open Datasets | Yes | We use three large datasets for training QMDS models: our two datasets QMDSCNN and QMDSIR, described in Sec. 3.1, and the WikiSum dataset. We also use the DUC 2006 and DUC 2007 datasets for evaluating our models.
Dataset Splits | Yes | Table 1: QMDSCNN and QMDSIR statistics. QMDSCNN (# samples): Train 287,113 / Val 13,368 / Test 11,490.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory, or cloud instance types) used for running the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments.
Experiment Setup | No | "Due to space constraints and no supplementary allowed in AAAI rules, we provide more details in the arXiv version." This indicates that specific experimental setup details, such as hyperparameters, are not present in the main paper.