ColdGANs: Taming Language GANs with Cautious Sampling Strategies
Authors: Thomas Scialom, Paul-Alexis Dray, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "For the first time, to the best of our knowledge, the proposed language GANs compare favorably to MLE, and obtain improvements over the state-of-the-art on three generative tasks, namely unconditional text generation, question generation, and abstractive summarization." and "Finally, we apply our proposed methods on three tasks. We report positive results compared to previous works, including GANs and MLE-based models." |
| Researcher Affiliation | Collaboration | CNRS, France; Sorbonne Université, CNRS, LIP6, F-75005 Paris, France; reciTAL, Paris, France; {thomas,paul-alexis,jacopo}@recital.ai; {sylvain.lamprier,benjamin.piwowarski}@lip6.fr |
| Pseudocode | No | The paper describes methods and equations, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the methodology described. |
| Open Datasets | Yes | "Following [6, 20, 38, 14, 4, 22], we used the EMNLP2017 news dataset." and "Following previous works [8, 9], we used the SQuAD dataset [32]." and "We used the popular CNN/DM dataset [25], a corpus containing news articles and the corresponding abstractive summaries." |
| Dataset Splits | Yes | "For all our experiments, we used the validation sets for hyperparameter selection." |
| Hardware Specification | No | The paper mentions using T5-small (60M parameters) and BART, but does not specify the hardware (e.g., GPU models, CPU types) used for running the experiments. |
| Software Dependencies | No | The paper mentions models like T5 and BART, but does not specify software dependencies with version numbers (e.g., specific Python, PyTorch, or TensorFlow versions). |
| Experiment Setup | Yes | "In more detail, we evaluated our approach with several learning rates, reporting results for a value of 2e-5." and "The best performance is achieved with the experiment emphasizing the ColdGAN_nucleus exploration the most, with ϵ = .9 and T = .2." and "In our experiments, we set c = 5." and "replace 1% of the discriminator training examples with samples from the buffer." |
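To make the quoted sampling hyperparameters concrete, below is a minimal sketch of "cold" temperature scaling combined with nucleus (top-p) filtering, using the reported T = .2 and ϵ = .9. The composition order of the two filters and the mapping of ϵ to the nucleus threshold are assumptions on our part; this is not the authors' released implementation.

```python
import torch

def cold_nucleus_sample(logits, temperature=0.2, top_p=0.9):
    """Sample one token with cold temperature scaling followed by
    nucleus (top-p) filtering. Values mirror the reported T = .2 and
    eps = .9; treating eps as the nucleus threshold is an assumption.
    """
    # Temperature < 1 sharpens ("cools") the distribution,
    # concentrating probability mass on high-likelihood tokens.
    probs = torch.softmax(logits / temperature, dim=-1)

    # Nucleus filtering: keep the smallest set of tokens whose
    # cumulative probability exceeds top_p, zero out the rest.
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # Shift by one position so the first token crossing the
    # threshold is still kept inside the nucleus.
    outside_nucleus = cumulative - sorted_probs > top_p
    sorted_probs[outside_nucleus] = 0.0
    sorted_probs /= sorted_probs.sum()  # renormalize

    # Sample from the truncated distribution and map the draw
    # back to the original vocabulary index.
    choice = torch.multinomial(sorted_probs, num_samples=1)
    return sorted_idx[choice].item()

# Example: sample from a toy 5-token vocabulary.
logits = torch.tensor([2.0, 1.0, 0.5, -1.0, -3.0])
token_id = cold_nucleus_sample(logits, temperature=0.2, top_p=0.9)
```

Both filters restrict generation to high-probability continuations, which is the "cautious sampling" idea the paper's title refers to; the replay-buffer detail quoted above (1% of discriminator training examples drawn from a buffer of past generator samples) is a separate training-stability mechanism not sketched here.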