ColdGANs: Taming Language GANs with Cautious Sampling Strategies

Authors: Thomas Scialom, Paul-Alexis Dray, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "For the first time, to the best of our knowledge, the proposed language GANs compare favorably to MLE, and obtain improvements over the state-of-the-art on three generative tasks, namely unconditional text generation, question generation, and abstractive summarization." and "Finally, we apply our proposed methods on three tasks. We report positive results compared to previous works, including GANs and MLE-based models."
Researcher Affiliation | Collaboration | CNRS, France; Sorbonne Université, CNRS, LIP6, F-75005 Paris, France; reciTAL, Paris, France; {thomas,paul-alexis,jacopo}@recital.ai; {sylvain.lamprier,benjamin.piwowarski}@lip6.fr
Pseudocode | No | The paper describes methods and equations, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about, or link to, open-source code for the described methodology.
Open Datasets | Yes | "Following [6, 20, 38, 14, 4, 22], we used the EMNLP2017 news dataset." and "Following previous works [8, 9], we used the SQuAD dataset [32]." and "We used the popular CNN/DM dataset [25], a corpus containing news articles and the corresponding abstractive summaries."
Dataset Splits | Yes | "For all our experiments, we used the validation sets for hyperparameter selection."
Hardware Specification | No | The paper mentions using T5-small (60M parameters) and BART, but does not specify the hardware (e.g., GPU models, CPU types) used to run the experiments.
Software Dependencies | No | The paper mentions models such as T5 and BART, but does not specify software dependencies with version numbers (e.g., specific Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | "In more detail, we evaluated our approach with several learning rates, reporting results for a value of 2e-5." and "The best performance is achieved with the experiment emphasizing the ColdGANnucleus exploration the most, with ε = .9 and T = .2." and "In our experiments, we set c = 5." and "replace 1% of the discriminator training examples with samples from the buffer."
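The quoted setup combines nucleus (top-p) sampling with a low softmax temperature, i.e. "cold" sampling. Below is a minimal illustrative sketch of that combination, assuming ε plays the role of the nucleus mass p and T is the temperature; the function name is hypothetical and this is not the authors' implementation.

```python
import numpy as np

def cold_nucleus_sample(logits, temperature=0.2, p=0.9, rng=None):
    """Sample a token id with temperature-scaled nucleus (top-p) sampling.

    Illustrative sketch only: lower temperatures sharpen the distribution
    ("colder" sampling), and p restricts sampling to the smallest set of
    tokens whose cumulative probability mass exceeds p.
    """
    rng = rng or np.random.default_rng()
    scaled = logits / temperature
    scaled = scaled - scaled.max()              # numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    # Sort tokens by probability and keep the smallest prefix with mass > p.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1
    nucleus = order[:cutoff]
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=nucleus_probs))
```

With T = .2 the scaled distribution is strongly peaked, so samples rarely stray from the highest-probability tokens even with a large nucleus mass such as p = .9.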