Adversarial Feature Matching for Text Generation

Authors: Yizhe Zhang, Zhe Gan, Kai Fan, Zhi Chen, Ricardo Henao, Dinghan Shen, Lawrence Carin

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show superior performance in quantitative evaluation, and demonstrate that our model can generate realistic-looking sentences.
Researcher Affiliation | Academia | Duke University, Durham, NC, 27708. Correspondence to: Yizhe Zhang <yizhe.zhang@duke.edu>.
Pseudocode | No | The paper includes model diagrams but does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described.
Open Datasets | Yes | Our model is trained using a combination of two datasets: (i) the BookCorpus dataset (Zhu et al., 2015), which consists of 70 million sentences from over 7000 books; and (ii) the arXiv dataset, which consists of 5 million sentences from abstracts of papers from various subjects, obtained from the arXiv website.
Dataset Splits | Yes | We randomly choose 0.5 million sentences from BookCorpus and 0.5 million sentences from arXiv to construct training and validation sets, i.e., 1 million sentences for each. For testing, we randomly select 25,000 sentences from both corpora, for a total of 50,000 sentences. (A sketch of this split appears below the table.)
Hardware Specification | Yes | All experiments are implemented in Theano (Bastien et al., 2012), using one NVIDIA GeForce GTX TITAN X GPU with 12GB memory.
Software Dependencies | No | The paper mentions 'Theano' but does not provide a specific version number. No other software components with specific versions are listed.
Experiment Setup | Yes | Provided that the LSTM generator typically involves more parameters and is more difficult to train than the CNN discriminator, we perform one optimization step for the discriminator for every K = 5 steps of the generator. We use a mixture of 5 isotropic Gaussian (RBF) kernels with different bandwidths σ as in Li et al. (2015). Bandwidth parameters are selected to be close to the median distance (in our case around 20) of feature vectors encoded from real sentences. λ_r and λ_m are selected based on the performance on the validation set. For the CNN discriminator/encoder, we use filter windows (h) of sizes {3,4,5} with 300 feature maps each... Gradients are clipped if the norm of the parameter vector exceeds 5 (Sutskever et al., 2014). Adam (Kingma & Ba, 2015) with learning rate 5 × 10⁻⁵ for both discriminator and generator is utilized for optimization. The size of the minibatch is set to 256. We also employed a warm-up training during the first two epochs...
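
The split quoted in the Dataset Splits row is easy to mirror in code. Below is a minimal per-corpus sketch, assuming each corpus has already been loaded as a list of sentence strings; only the counts (0.5 million training, 0.5 million validation, and 25,000 test sentences per corpus) come from the paper, while the function name and seed are hypothetical.

```python
import random

def split_corpus(sentences, n_train=500_000, n_val=500_000,
                 n_test=25_000, seed=0):
    """Draw disjoint train/validation/test pools from one corpus
    (BookCorpus or arXiv), using the per-corpus counts quoted above."""
    rng = random.Random(seed)
    picked = rng.sample(sentences, n_train + n_val + n_test)
    return (picked[:n_train],
            picked[n_train:n_train + n_val],
            picked[n_train + n_val:])
```

Concatenating the two per-corpus draws yields the 1-million-sentence training and validation sets and the 50,000-sentence test set described in the quote.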
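
The Experiment Setup row describes a feature-matching loss built from a mixture of five isotropic RBF kernels with bandwidths near the median inter-feature distance (around 20). The sketch below is a plain NumPy version of that kernel mixture and the resulting squared MMD between CNN feature batches for real and synthetic sentences. The five-kernel mixture, the minibatch size of 256, and the 900-dimensional features (three window sizes × 300 feature maps) follow the quoted setup; the specific bandwidth values and the random stand-in feature vectors are illustrative assumptions.

```python
import numpy as np

def mix_rbf_kernel(x, y, sigmas):
    """Mixture of isotropic RBF (Gaussian) kernels, evaluated pairwise.
    The bandwidths in `sigmas` are assumed values spread around the
    median feature distance (~20 per the quoted setup)."""
    # Pairwise squared Euclidean distances between rows of x and y.
    d2 = (np.sum(x ** 2, axis=1)[:, None]
          + np.sum(y ** 2, axis=1)[None, :]
          - 2.0 * (x @ y.T))
    return sum(np.exp(-d2 / (2.0 * s ** 2)) for s in sigmas)

def squared_mmd(f_real, f_fake, sigmas=(5.0, 10.0, 20.0, 40.0, 80.0)):
    """Squared maximum mean discrepancy between real and synthetic
    feature batches; the generator is trained to drive this down."""
    k_rr = mix_rbf_kernel(f_real, f_real, sigmas)
    k_ff = mix_rbf_kernel(f_fake, f_fake, sigmas)
    k_rf = mix_rbf_kernel(f_real, f_fake, sigmas)
    return k_rr.mean() + k_ff.mean() - 2.0 * k_rf.mean()

# Illustrative batch: 256 sentences (the quoted minibatch size) with
# 900-dim features (filter windows {3,4,5}, 300 feature maps each).
rng = np.random.default_rng(0)
f_real = rng.normal(size=(256, 900))
f_fake = rng.normal(size=(256, 900))
print(squared_mmd(f_real, f_fake))
```

Under the quoted schedule, the generator would take K = 5 Adam steps (learning rate 5 × 10⁻⁵, gradient norm clipped at 5) against this objective for every single discriminator step.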