TextGAIL: Generative Adversarial Imitation Learning for Text Generation

Authors: Qingyang Wu, Lei Li, Zhou Yu

AAAI 2021, pp. 14067-14075

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | For evaluation, we conduct experiments on a diverse set of unconditional and conditional text generation tasks. Experimental results show that TextGAIL achieves better performance in terms of both quality and diversity than the MLE baseline.
Researcher Affiliation | Collaboration | Qingyang Wu 1, Lei Li 2, Zhou Yu 1; 1 University of California, Davis; 2 ByteDance AI Lab; {wilwu, joyu}@ucdavis.edu, lileilab@bytedance.com
Pseudocode | Yes | Algorithm 1 TextGAIL (a hedged training-loop sketch appears after the table)
Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | Previous text GANs often only perform experiments on unconditional generation tasks: COCO and EMNLP2017 News. We extend the experiments to conditional generation tasks, as they have more practical applications. Specifically, we experiment with our model on CommonGEN and ROCStories.
Dataset Splits | No | The paper mentions using a 'part of the training set' for warm-up and describes stopping criteria, but it does not specify explicit training, validation, and test dataset splits (e.g., percentages or exact sample counts) for reproducibility.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions the use of 'GPT-2 base' and 'RoBERTa-base' models, but it does not specify any general software dependencies such as programming language versions (e.g., Python 3.x) or library versions (e.g., TensorFlow 2.x, PyTorch 1.x).
Experiment Setup | Yes | The human demonstrations mix ratio p is set to 0.3 at the start of training and linearly decays afterward. The constant reward for human demonstrations is set to 2.0. ... We perform beam search with a beam size of four on the two conditional generation tasks. ... We observe the model has less repetition and better quality with nucleus sampling with hyper-parameters top-p 0.9 and temperature 0.8.
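
Since the paper's code is not released, the following is a minimal, self-contained sketch of how the Algorithm 1 training loop might look, based only on the details quoted above (mix ratio p = 0.3 with linear decay, constant demonstration reward 2.0) and the paper's general GAIL-with-PPO setup. All function bodies are toy stand-ins, not the authors' implementation.

```python
import random

P_START = 0.3        # initial human-demonstration mix ratio (from the paper)
DEMO_REWARD = 2.0    # constant reward for human demonstrations (from the paper)
TOTAL_STEPS = 1000   # assumed training length; the paper does not state this

demonstrations = ["a human-written sentence", "another reference text"]  # toy data

def mix_ratio(step):
    """Linearly decay the mix ratio from P_START toward 0 over training."""
    return P_START * (1.0 - step / TOTAL_STEPS)

def generator_sample():           # stand-in for GPT-2 nucleus sampling
    return ["a generated sentence"]

def discriminator_reward(texts):  # stand-in for RoBERTa discriminator scores
    return [random.random() for _ in texts]

def ppo_update(texts, rewards):   # stand-in for the PPO policy-gradient step
    pass

for step in range(TOTAL_STEPS):
    if random.random() < mix_ratio(step):
        # Feed human demonstrations to the generator update with a fixed reward.
        batch, rewards = demonstrations, [DEMO_REWARD] * len(demonstrations)
    else:
        # Sample from the generator and score the samples with the discriminator.
        batch = generator_sample()
        rewards = discriminator_reward(batch)
    ppo_update(batch, rewards)
```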
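
The decoding settings in the Experiment Setup row are concrete enough to reproduce with the Hugging Face `transformers` API. This sketch shows both modes with a GPT-2 base checkpoint; the prompt is a hypothetical stand-in, and this is not the authors' code.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # "GPT-2 base" per the paper

prompt = "The concert was packed because"  # hypothetical story-style prefix
inputs = tokenizer(prompt, return_tensors="pt")

# Conditional tasks (CommonGEN, ROCStories): beam search with a beam size of four.
beam_out = model.generate(**inputs, num_beams=4, max_new_tokens=40)

# Sampling: nucleus sampling with top-p 0.9 and temperature 0.8.
sample_out = model.generate(
    **inputs, do_sample=True, top_p=0.9, temperature=0.8, max_new_tokens=40
)

print(tokenizer.decode(beam_out[0], skip_special_tokens=True))
print(tokenizer.decode(sample_out[0], skip_special_tokens=True))
```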