TextGAIL: Generative Adversarial Imitation Learning for Text Generation
Authors: Qingyang Wu, Lei Li, Zhou Yu (pp. 14067-14075)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For evaluation, we conduct experiments on a diverse set of unconditional and conditional text generation tasks. Experimental results show that TextGAIL achieves better performance in terms of both quality and diversity than the MLE baseline. |
| Researcher Affiliation | Collaboration | Qingyang Wu (1), Lei Li (2), Zhou Yu (1); (1) University of California, Davis; (2) ByteDance AI Lab; {wilwu, joyu}@ucdavis.edu, lileilab@bytedance.com |
| Pseudocode | Yes | Algorithm 1 TextGAIL (see the training-loop sketch below the table) |
| Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Previous text GANs often only perform experiments on unconditional generation tasks: COCO and EMNLP2017 News. We extend the experiments to conditional generation tasks, which are more practical applications. Specifically, we evaluate our model on CommonGEN and ROCStories. |
| Dataset Splits | No | The paper mentions using a 'part of the training set' for warm-up and describes stopping criteria, but it does not specify explicit training, validation, and test dataset splits (e.g., percentages or exact sample counts) for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of 'GPT-2 base' and 'RoBERTa-base' models, but it does not specify any general software dependencies like programming language versions (e.g., Python 3.x) or library versions (e.g., TensorFlow 2.x, PyTorch 1.x). |
| Experiment Setup | Yes | The human demonstration mix ratio p is set to 0.3 at the start of training and linearly decays afterward. The constant reward for human demonstrations is set to 2.0. ... We perform beam search with a beam size of four on the two conditional generation tasks. ... We observe the model has less repetition and better quality with nucleus sampling with hyper-parameters top-p 0.9 and temperature 0.8. (See the decoding sketch below the table.) |
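
The Pseudocode row points to Algorithm 1 (TextGAIL), which is not reproduced here. Below is a minimal Python sketch of a GAIL-style training loop consistent with the details quoted in the table: a GPT-2 generator, a RoBERTa discriminator, a human-demonstration mix ratio p starting at 0.3 and decaying linearly, and a constant reward of 2.0 for demonstrations. The objects `generator`, `discriminator`, and `dataset` and their methods are hypothetical placeholders, not the authors' code.

```python
# Hedged sketch of a GAIL-style text-generation loop in the spirit of Algorithm 1
# (TextGAIL). Only the mix ratio p, its linear decay, and the constant reward of 2.0
# for human demonstrations come from the paper; all APIs below are hypothetical.
import random

def train_textgail(generator, discriminator, dataset, steps,
                   p_start=0.3, demo_reward=2.0):
    for step in range(steps):
        # Linearly decay the human-demonstration mix ratio p from p_start toward 0.
        p = p_start * (1.0 - step / steps)

        contexts, references = dataset.sample_batch()   # hypothetical data API
        generated = generator.sample(contexts)          # model roll-outs

        # Mix human demonstrations into the roll-out buffer with probability p;
        # demonstrations receive the constant reward instead of a discriminator score.
        batch, rewards = [], []
        for ctx, ref, gen in zip(contexts, references, generated):
            if random.random() < p:
                batch.append((ctx, ref))
                rewards.append(demo_reward)
            else:
                batch.append((ctx, gen))
                rewards.append(discriminator.score(ctx, gen))  # hypothetical API

        # Policy-gradient update of the generator and an update of the discriminator;
        # the exact optimizers are not specified in the quoted material.
        generator.policy_update(batch, rewards)
        discriminator.update(contexts, references, generated)
```

The `policy_update` call stands in for whatever policy-gradient optimizer the authors use; the sketch only illustrates the demonstration mixing and reward assignment described in the table.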
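
For the decoding settings quoted in the Experiment Setup row, the following sketch uses the Hugging Face `transformers` API. The hyper-parameters (nucleus sampling with top-p 0.9 and temperature 0.8; beam search with beam size 4) come from the paper; the library, checkpoint name, and prompt are assumptions for illustration.

```python
# Minimal decoding sketch, assuming a Hugging Face GPT-2 checkpoint; the paper
# specifies the hyper-parameters but not this particular library or code.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "A man is sitting on"  # illustrative prompt, not from the paper
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Nucleus sampling with top-p 0.9 and temperature 0.8, as quoted above.
sampled = model.generate(input_ids, do_sample=True, top_p=0.9, temperature=0.8,
                         max_length=50, pad_token_id=tokenizer.eos_token_id)

# Beam search with beam size 4, as reported for the two conditional tasks.
beamed = model.generate(input_ids, num_beams=4, do_sample=False,
                        max_length=50, pad_token_id=tokenizer.eos_token_id)

print(tokenizer.decode(sampled[0], skip_special_tokens=True))
print(tokenizer.decode(beamed[0], skip_special_tokens=True))
```

Beam search targets the conditional tasks (CommonGEN, ROCStories), while nucleus sampling trades likelihood for diversity, matching the observation quoted above that it yields less repetition and better quality.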