Differentiated Distribution Recovery for Neural Text Generation

Authors: Jianing Li, Yanyan Lan, Jiafeng Guo, Jun Xu, Xueqi Cheng

AAAI 2019, pp. 6682-6689 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on synthetic data and two public text datasets show that our DDR method achieves a more flexible quality-diversity trade-off and a higher Turing Test pass rate, compared with baseline methods including RNNLM, SeqGAN and LeakGAN.
Researcher Affiliation | Academia | CAS Key Laboratory of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; Department of Statistics, University of California, Berkeley
Pseudocode | No | The paper includes an architecture illustration (Fig. 1), a function plot (Fig. 3), and a theorem with a proof, but no explicit pseudocode block or algorithm steps.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We conduct experiments on synthetic data, the MSCOCO Image Caption dataset (Chen et al. 2015), and the EMNLP2017 WMT News dataset (http://statmt.org/wmt17/translation-task.html).
Dataset Splits | No | The paper specifies training and test set sizes for the MSCOCO and WMT datasets, but does not explicitly mention a validation split. It evaluates on generated samples and test sets.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU/GPU models, memory specifications).
Software Dependencies | No | The paper states that "All models are trained using the Adam optimizer (Kingma and Ba 2014)", but does not provide version numbers for any software dependencies such as programming languages or libraries.
Experiment Setup | Yes | Embedding dimensions and the number of LSTM hidden nodes are set to 32 on synthetic data and 128 on the other two datasets. All models are trained using the Adam optimizer (Kingma and Ba 2014). Similar to SeqGAN and LeakGAN, we also pre-train our model using MLE before applying DDR.
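The Experiment Setup row above can be illustrated with a minimal sketch. This is an assumption-laden reconstruction, not the authors' code: the paper does not name a framework (PyTorch is assumed here), and the vocabulary size and learning rate below are placeholders, since only the embedding/hidden dimensions (32 or 128), the Adam optimizer, and MLE pre-training are reported.

```python
import torch
import torch.nn as nn

# Dimensions follow the paper: 32 on synthetic data, 128 on MSCOCO and WMT News.
EMBED_DIM = HIDDEN_DIM = 128
VOCAB_SIZE = 5000  # illustrative placeholder; not reported in the paper's setup section


class LSTMGenerator(nn.Module):
    """LSTM language model of the kind pre-trained with MLE before applying DDR."""

    def __init__(self, vocab_size=VOCAB_SIZE, embed_dim=EMBED_DIM, hidden_dim=HIDDEN_DIM):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        hidden, _ = self.lstm(self.embed(tokens))
        return self.out(hidden)  # per-step vocabulary logits


model = LSTMGenerator()
optimizer = torch.optim.Adam(model.parameters())  # learning rate not reported
criterion = nn.CrossEntropyLoss()


def mle_pretrain_step(batch):
    """One MLE pre-training step: predict each next token from its prefix."""
    inputs, targets = batch[:, :-1], batch[:, 1:]
    logits = model(inputs)
    loss = criterion(logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The DDR training stage that follows this MLE pre-training is not sketched here, since the quoted setup description does not specify it in enough detail.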