QuerySum: A Multi-Document Query-Focused Summarization Dataset Augmented with Similar Query Clusters
Authors: Yushan Liu, Zili Wang, Ruifeng Yuan
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the performance of existing summarization models on the QuerySum dataset for a better understanding of the dataset. |
| Researcher Affiliation | Collaboration | Yushan Liu (Fudan University), Zili Wang (INF Technology (Shanghai) Co., Ltd.), Ruifeng Yuan (The Hong Kong Polytechnic University); yushanliu21@m.fudan.edu.cn, ziliwang.do@gmail.com, ruifeng.yuan@connect.polyu.hk |
| Pseudocode | No | The paper describes the model architecture and its components but does not provide pseudocode or a clearly labeled algorithm block. |
| Open Source Code | No | The paper states that the dataset is available on GitHub (https://github.com/613lys/QuerySum); this link is for the dataset, not the source code for the proposed model or methodology. |
| Open Datasets | Yes | We build a new large-scale query-focused multi-document summarization dataset called QuerySum. ... Our dataset is available on GitHub: https://github.com/613lys/QuerySum |
| Dataset Splits | Yes | Following previous work, we randomly extract 15% of the data samples as the validation set and another 15% as the test set. (See the splitting sketch after the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU specifications, or cloud computing instances. |
| Software Dependencies | No | The paper mentions using PEGASUS-LARGE/BASE and the Adam optimizer, but does not provide specific version numbers for software dependencies such as programming languages or libraries (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | All the hyperparameters are adjusted on the development set. For optimization, the batch size is set to 16. We use dropout with a probability of 0.1 and label smoothing (Szegedy et al. 2015) with a smoothing factor of 0.1. The optimizer is Adam (Kingma and Ba 2014) with a learning rate of 0.001. In addition, we apply warm-up over the first 10% of steps and a learning rate decay of 0.95. (See the training-setup sketch after the table.) |
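
The split reported in the Dataset Splits row is a simple random partition. Below is a minimal Python sketch of such a 70/15/15 split; the function name, the `samples` placeholder, and the seed are illustrative assumptions, not code from the paper.

```python
import random

def split_dataset(samples, val_frac=0.15, test_frac=0.15, seed=42):
    """Randomly split examples into train/validation/test (70/15/15 by default).

    `samples` stands in for the loaded QuerySum examples; the seed is an
    arbitrary choice for reproducibility, not a value reported in the paper.
    """
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n_val = int(len(samples) * val_frac)
    n_test = int(len(samples) * test_frac)
    validation = samples[:n_val]
    test = samples[n_val:n_val + n_test]
    train = samples[n_val + n_test:]
    return train, validation, test
```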
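
The hyperparameters quoted in the Experiment Setup row can be assembled into a training configuration roughly as follows. This is a hedged sketch, assuming Hugging Face `transformers` for PEGASUS and plain PyTorch for the optimizer and scheduler; the checkpoint name, the total step count, and the per-1,000-step interpretation of the 0.95 learning-rate decay are assumptions not stated in the paper.

```python
import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

# Assumption: the public PEGASUS-LARGE checkpoint; the paper only names PEGASUS-LARGE/BASE.
checkpoint = "google/pegasus-large"
tokenizer = PegasusTokenizer.from_pretrained(checkpoint)
model = PegasusForConditionalGeneration.from_pretrained(checkpoint, dropout=0.1)  # dropout 0.1

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam, learning rate 0.001
loss_fn = torch.nn.CrossEntropyLoss(
    ignore_index=tokenizer.pad_token_id,
    label_smoothing=0.1,  # label-smoothing factor 0.1
)

batch_size = 16                         # batch size reported in the paper
total_steps = 50_000                    # assumption: depends on dataset size and epochs
warmup_steps = int(0.1 * total_steps)   # warm-up over the first 10% of steps

def lr_lambda(step: int) -> float:
    # Linear warm-up, then exponential decay; "learning rate decay of 0.95" is
    # interpreted here as multiplying the rate by 0.95 every 1,000 steps,
    # which is an assumption rather than a detail given in the paper.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return 0.95 ** ((step - warmup_steps) // 1000)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```

In a training loop, `scheduler.step()` would be called once per optimizer step so that the warm-up fraction lines up with the assumed total step count.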