Meta-CoTGAN: A Meta Cooperative Training Paradigm for Improving Adversarial Text Generation
Authors: Haiyan Yin, Dingcheng Li, Xu Li, Ping Li
AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the experiment, we demonstrate our proposed approach can efficiently slow down the pace of mode collapse for the adversarial text generators. Overall, our proposed method is able to outperform the baseline approaches with significant margins in terms of both generation quality and diversity in the testified domains. |
| Researcher Affiliation | Industry | Haiyan Yin, Dingcheng Li, Xu Li, Ping Li Cognitive Computing Lab Baidu Research No.10 Xibeiwang East Road, Beijing, 10085, China 10900 NE 8th ST. Bellevue, WA 98004, USA {haiyanyin18, pingli98}@gmail.com, {lidingcheng, lixu13}@baidu.com |
| Pseudocode | Yes | Algorithm 1 Meta Cooperative Training |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Our first evaluation domain is the synthetic oracle dataset, which is first proposed in (Yu et al. 2017). Our second evaluation domain is the COCO Image Captions dataset. We follow the pre-processing method proposed in (Zhu et al. 2018)... Our third evaluation domain is the EMNLP2017 WMT News dataset. The size of this dataset is much larger than Image COCO, involving a training set of 270,000 sentences. The testing set consists of 10,000 sentences. The sentences have maximum length of 51. The vocabulary size is 5,255. |
| Dataset Splits | No | The paper specifies the sizes of training and testing sets for COCO Image Captions and EMNLP2017 WMT News datasets, but it does not explicitly mention a distinct validation set split (e.g., in percentages or specific sample counts) or refer to a standard split that includes a validation set for reproducibility. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU specifications, or memory. |
| Software Dependencies | No | The paper mentions using 'Adam (Kingma and Ba 2015) as the optimization algorithm' but does not specify versions for any software libraries, frameworks (e.g., TensorFlow, PyTorch), or programming languages used for implementation. |
| Experiment Setup | Yes | The relational memory adopts 1 memory slot, multi-head attention with 2 heads, and the attention key size is set to be 512. The language model for cooperative training adopts the identical network architecture as the generator, and the weights for the generator's parameters are assigned to the language model after pretraining. The discriminator adopts multiple representations with size to be 64. We adopt Adam (Kingma and Ba 2015) as the optimization algorithm for updating all the model parameters. During evaluation, we follow the temperature settings proposed in RelGAN and present the results for our method when evaluated with temperature values of 100 and 1000, respectively. (A hedged configuration sketch collecting these values appears after this table.) |
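
To make the quoted Experiment Setup easier to scan, the sketch below collects the reported hyperparameters into a minimal Python configuration. The class and field names (`GeneratorConfig`, `mem_slots`, `key_size`, etc.) are illustrative assumptions and do not come from the authors' code; only the numeric values, the Adam optimizer choice, and the evaluation temperatures are taken from the paper as quoted above.

```python
# Hypothetical configuration sketch assembled from the "Experiment Setup" row.
# Field names are assumptions; values follow the paper's reported settings.
from dataclasses import dataclass, field
from typing import List


@dataclass
class GeneratorConfig:
    # Relational-memory generator settings reported in the paper.
    mem_slots: int = 1    # 1 memory slot
    num_heads: int = 2    # multi-head attention with 2 heads
    key_size: int = 512   # attention key size of 512


@dataclass
class DiscriminatorConfig:
    # "Multiple representations with size to be 64" as quoted; the exact
    # meaning of "size" is not elaborated in the table.
    representation_size: int = 64


@dataclass
class ExperimentConfig:
    generator: GeneratorConfig = field(default_factory=GeneratorConfig)
    # The cooperative language model reuses the generator architecture and is
    # initialized from the pretrained generator weights, per the quoted setup.
    language_model: GeneratorConfig = field(default_factory=GeneratorConfig)
    discriminator: DiscriminatorConfig = field(default_factory=DiscriminatorConfig)
    optimizer: str = "adam"  # Adam (Kingma and Ba 2015); learning rates not reported
    # Evaluation temperatures following the RelGAN protocol, as quoted.
    eval_temperatures: List[int] = field(default_factory=lambda: [100, 1000])


if __name__ == "__main__":
    print(ExperimentConfig())
```

Note that this sketch only records hyperparameters; it does not reproduce the paper's Algorithm 1 (Meta Cooperative Training), whose steps are not detailed in this table.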