Discriminative Adversarial Search for Abstractive Summarization
Authors: Thomas Scialom, Paul-Alexis Dray, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We investigate the effectiveness of the proposed approach on the task of Abstractive Summarization: the results obtained show that DAS improves over the state-of-the-art methods, with further gains obtained via discriminator retraining. |
| Researcher Affiliation | Collaboration | Thomas Scialom 1 2, Paul-Alexis Dray 1, Sylvain Lamprier 2, Benjamin Piwowarski 2 3, Jacopo Staiano 1 (1 reciTAL, Paris, France; 2 Sorbonne Université, CNRS, LIP6, F-75005 Paris, France; 3 CNRS, France) |
| Pseudocode | Yes | Algorithm 1 DAS: a Beam Search algorithm with the proposed discriminator re-ranking mechanism highlighted. (A hedged re-ranking sketch follows the table.) |
| Open Source Code | No | The paper provides links to the UniLM model it builds upon and to a dataset split, but not to the source code of the proposed DAS methodology. |
| Open Datasets | Yes | One of the most popular datasets for summarization is the CNN/Daily Mail (CNN/DM) dataset (Hermann et al., 2015; Nallapati et al., 2016). ... Publicly available at https://github.com/microsoft/unilm#abstractive-summarization--cnn--daily-mail ... Furthermore, to assess the possible benefits of the proposed approach in a domain adaptation setup, we conduct experiments on TL;DR, a large-scale summarization dataset built from social media data (Völske et al., 2017). ... The training set is composed of around 3M examples and is publicly available, while the test set is kept hidden because of public ongoing leaderboard evaluation. Hence, we randomly sampled 100k examples for training, 5k for validation and 5k for test. For reproducibility purposes, we make the TL;DR split used in this work publicly available. |
| Dataset Splits | Yes | For fair comparison, we used the exact same dataset version as previous works (See et al., 2017; Gehrmann et al., 2018; Dong et al., 2019). ... We randomly sampled 100k examples for training, 5k for validation and 5k for test. For reproducibility purposes, we make the TL;DR split used in this work publicly available. (A hedged split sketch follows the table.) |
| Hardware Specification | Yes | For all our experiments we used a single RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions that the models are implemented in PyText and that the Adam optimizer is used for BERT, but it does not specify version numbers for these or other software libraries. |
| Experiment Setup | Yes | To train the discriminator, we used the Adam optimiser with the recommended parameters for BERT: learning rate of 3e-5, batch size of 4 and accumulated batch size of 32. We trained it for 5 epochs; each epoch took 100 minutes on 150k samples. (A hedged training-loop sketch follows the table.) |
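
To make the Pseudocode row concrete, here is a minimal Python sketch of beam search with step-wise discriminator re-ranking, the mechanism Algorithm 1 adds. It assumes a `generator(src_ids, prefix)` callable returning next-token log-probabilities and a `discriminator(src_ids, prefix)` callable returning the probability that the prefix looks human-written; these interfaces, the mixing weight `mix`, and the special-token ids are illustrative assumptions, not the authors' actual API or scoring rule.

```python
import torch

def das_beam_search(generator, discriminator, src_ids, beam_size=5,
                    max_len=60, mix=0.5, bos_id=101, eos_id=102):
    # Beams hold (token prefix, cumulative generator log-probability).
    beams = [(torch.tensor([bos_id]), 0.0)]
    for _ in range(max_len):
        candidates = []
        for prefix, logp in beams:
            if prefix[-1].item() == eos_id:      # finished beam: keep as-is
                candidates.append((prefix, logp))
                continue
            next_logp = generator(src_ids, prefix)        # (vocab_size,)
            top_lp, top_ids = next_logp.topk(beam_size)
            for lp, tok in zip(top_lp, top_ids):
                candidates.append(
                    (torch.cat([prefix, tok.view(1)]), logp + lp.item()))
        # Discriminator re-ranking: mix the length-normalised generator
        # likelihood with the discriminator's "human-written" probability.
        def score(cand):
            prefix, logp = cand
            d = discriminator(src_ids, prefix).item()     # in [0, 1]
            return mix * (logp / len(prefix)) + (1.0 - mix) * d
        candidates.sort(key=score, reverse=True)
        beams = candidates[:beam_size]
        if all(p[-1].item() == eos_id for p, _ in beams):
            break
    return beams[0][0]    # tokens of the top-ranked hypothesis
```

The point of the sketch is that re-ranking happens at every decoding step, inside the search, rather than only on the finished hypotheses.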
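The TL;DR split quoted in the Dataset Splits row (100k train / 5k validation / 5k test, sampled from the ~3M public training examples) could be reproduced along the following lines; the seed and sampling order are assumptions, and the authors' released split should be preferred for exact reproducibility.

```python
import random

def sample_tldr_split(examples, seed=42):
    # Draw 110k distinct indices, then carve out 100k/5k/5k.
    # The seed value is an assumption; the paper does not report one.
    rng = random.Random(seed)
    picked = rng.sample(range(len(examples)), 110_000)
    train = [examples[i] for i in picked[:100_000]]
    valid = [examples[i] for i in picked[100_000:105_000]]
    test = [examples[i] for i in picked[105_000:]]
    return train, valid, test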
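Finally, the Experiment Setup row translates into a standard PyTorch gradient-accumulation loop: micro-batches of 4 are accumulated into an effective batch of 32 (i.e. 8 micro-batches) before each Adam step at a learning rate of 3e-5, for 5 epochs. Only the hyper-parameter values come from the paper; the binary cross-entropy objective and the data-loader interface are assumptions.

```python
import torch
from torch.optim import Adam

def train_discriminator(model, loader, epochs=5, lr=3e-5,
                        micro_batch=4, effective_batch=32):
    # `loader` is assumed to yield (inputs, labels) micro-batches
    # of size `micro_batch`, with float labels in {0, 1}.
    accum_steps = effective_batch // micro_batch  # 8 micro-batches per update
    optimizer = Adam(model.parameters(), lr=lr)
    criterion = torch.nn.BCEWithLogitsLoss()      # assumed binary objective
    model.train()
    for _ in range(epochs):
        optimizer.zero_grad()
        for step, (inputs, labels) in enumerate(loader, start=1):
            logits = model(inputs)
            # Scale so accumulated gradients average over the effective batch.
            loss = criterion(logits, labels) / accum_steps
            loss.backward()
            if step % accum_steps == 0:
                optimizer.step()
                optimizer.zero_grad()
```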