Discriminative Adversarial Search for Abstractive Summarization

Authors: Thomas Scialom, Paul-Alexis Dray, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We investigate the effectiveness of the proposed approach on the task of Abstractive Summarization: the results obtained show that DAS improves over the state-of-the-art methods, with further gains obtained via discriminator retraining.
Researcher Affiliation | Collaboration | Thomas Scialom 1,2; Paul-Alexis Dray 1; Sylvain Lamprier 2; Benjamin Piwowarski 2,3; Jacopo Staiano 1. Affiliations: 1 reciTAL, Paris, France; 2 Sorbonne Université, CNRS, LIP6, F-75005 Paris, France; 3 CNRS, France.
Pseudocode | Yes | Algorithm 1 presents DAS: a beam search algorithm with the proposed discriminator re-ranking mechanism highlighted.
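The re-ranking mechanism described above can be sketched as follows. This is a toy, model-free illustration only: the function names, the linear mix of generator log-probability and discriminator score, and the 0.5 weighting are hypothetical stand-ins, not the paper's exact formulation.

```python
import heapq

def das_beam_search(step_scores, disc_score, beam_size=2, max_len=3, mix=0.5):
    """Toy sketch of discriminator-re-ranked beam search.

    step_scores(prefix) -> dict {token: log_prob} from a generator model.
    disc_score(prefix)  -> score in [0, 1] from a discriminator judging
                           how human-like the prefix looks.
    At each step, candidate expansions are ranked by a mix of cumulative
    generator log-probability and discriminator score, instead of
    log-probability alone as in standard beam search.
    """
    beams = [((), 0.0)]  # (token sequence, cumulative generator log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, logp in beams:
            for tok, lp in step_scores(seq).items():
                new_seq = seq + (tok,)
                new_logp = logp + lp
                # Re-ranking score: blend generator and discriminator signals.
                rank = mix * new_logp + (1 - mix) * disc_score(new_seq)
                candidates.append((rank, new_seq, new_logp))
        beams = [(seq, logp) for _, seq, logp in
                 heapq.nlargest(beam_size, candidates)]
    return beams[0][0]
```

With a toy generator that always prefers token "a" and a discriminator that rewards sequences ending in "b", the re-ranked search can surface hypotheses that plain likelihood-based beam search would down-rank.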
Open Source Code | No | The paper provides links to the UniLM model it builds upon and to a dataset split, but not to the source code of the proposed DAS method.
Open Datasets | Yes | One of the most popular datasets for summarization is the CNN/Daily Mail (CNN/DM) dataset (Hermann et al., 2015; Nallapati et al., 2016). ... Publicly available at https://github.com/microsoft/unilm#abstractive-summarization--cnn--daily-mail ... Furthermore, to assess the possible benefits of the proposed approach in a domain adaptation setup, we conduct experiments on TL;DR, a large-scale summarization dataset built from social media data (Völske et al., 2017). ... The training set is composed of around 3M examples and publicly available, while the test set is kept hidden because of public ongoing leaderboard evaluation. Hence, we randomly sampled 100k examples for training, 5k for validation and 5k for test. For reproducibility purposes, we make the TL;DR split used in this work publicly available.
Dataset Splits | Yes | For fair comparison, we used the exact same dataset version as previous works (See et al., 2017; Gehrmann et al., 2018; Dong et al., 2019). ... For reproducibility purposes, we make the TL;DR split used in this work publicly available. Hence, we randomly sampled 100k examples for training, 5k for validation and 5k for test.
Hardware Specification | Yes | For all our experiments we used a single RTX 2080 Ti GPU.
Software Dependencies | No | The paper mentions that models are implemented in PyText and that the Adam optimizer is used for BERT, but does not specify version numbers for these or other software libraries.
Experiment Setup | Yes | To train the discriminator, we used the Adam optimiser with the recommended parameters for BERT: learning rate of 3e-5, batch size of 4 and accumulated batch size of 32. We trained it for 5 epochs; each epoch took 100 minutes on 150k samples.
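The quoted hyperparameters imply a gradient-accumulation schedule, sketched below under stated assumptions: the accumulation step count is derived from the two quoted batch sizes (4 accumulated to 32), and the helper function name and structure are illustrative, not taken from the paper.

```python
# Hyperparameters quoted in the experiment setup above.
LEARNING_RATE = 3e-5       # Adam learning rate recommended for BERT
BATCH_SIZE = 4             # per-step (micro) batch size
EFFECTIVE_BATCH_SIZE = 32  # accumulated batch size
EPOCHS = 5

# Derived, not stated: micro-batches accumulated before each optimizer update.
ACCUMULATION_STEPS = EFFECTIVE_BATCH_SIZE // BATCH_SIZE  # 32 / 4 = 8

def updates_per_epoch(num_samples, batch_size=BATCH_SIZE,
                      accumulation_steps=ACCUMULATION_STEPS):
    """Optimizer updates per epoch under gradient accumulation
    (drops any incomplete trailing accumulation window)."""
    micro_batches = num_samples // batch_size
    return micro_batches // accumulation_steps
```

For the reported 150k training samples per epoch, this schedule yields 150,000 / 4 = 37,500 micro-batches and 37,500 / 8 = 4,687 full optimizer updates per epoch.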