Paraphrase Generation with Latent Bag of Words

Authors: Yao Fu, Yansong Feng, John P. Cunningham

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the transparent and effective generation process of this model.
Researcher Affiliation | Academia | Yao Fu, Department of Computer Science, Columbia University, yao.fu@columbia.edu; Yansong Feng, Institute of Computer Science and Technology, Peking University, fengyansong@pku.edu.cn; John P. Cunningham, Department of Statistics, Columbia University, jpc2181@columbia.edu
Pseudocode | No | The paper does not contain pseudocode or a clearly labeled algorithm block.
Open Source Code | Yes | Our code can be found at https://github.com/FranxYao/dgm_latent_bow
Open Datasets | Yes | Following the settings in previous works [26, 15], we use the Quora dataset and the MSCOCO [28] dataset for our experiments. ... For the Quora dataset, there are 50K training instances and 20K testing instances, and the vocabulary size is 8K. For the MSCOCO dataset, there are 94K training instances and 23K testing instances, and the vocabulary size is 11K.
Dataset Splits | No | The paper only explicitly mentions 'training instances' and 'testing instances' with specific counts, but does not provide details on a validation split.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions LSTMs [18] and Adam [23] as components but does not specify version numbers for any software or libraries.
Experiment Setup | Yes | We set the maximum sentence length for the two datasets to be 16. ... The Seq2seq-Attn model is trained with 500 state size and 2 stacked LSTM layers. ... Experiments are repeated three times with different random seeds. The average performance is reported. More configuration details are listed in the appendix.
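The reported settings are enough to reconstruct a rough configuration. Below is a minimal, hypothetical PyTorch-style sketch (not the authors' released code) that wires up the stated hyperparameters: maximum sentence length 16, a 2-layer LSTM with state size 500, the Adam optimizer, and three runs with different random seeds. The learning rate, the seed values, and all identifier names are assumptions for illustration.

```python
import torch
import torch.nn as nn

MAX_LEN = 16          # maximum sentence length reported for both datasets
STATE_SIZE = 500      # state size reported for the Seq2seq-Attn baseline
NUM_LAYERS = 2        # stacked LSTM layers reported in the paper
VOCAB_SIZE = 8000     # Quora vocabulary size reported in the paper (8K)
SEEDS = [0, 1, 2]     # paper only says "three different random seeds"; values assumed

class Seq2SeqAttnEncoder(nn.Module):
    """Encoder half of a Seq2seq-Attn baseline, sized to the reported settings."""
    def __init__(self, vocab_size: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, STATE_SIZE)
        self.lstm = nn.LSTM(STATE_SIZE, STATE_SIZE,
                            num_layers=NUM_LAYERS, batch_first=True)

    def forward(self, token_ids: torch.Tensor):
        # token_ids: (batch, length), with length truncated to MAX_LEN upstream
        outputs, (h, c) = self.lstm(self.embed(token_ids))
        return outputs, (h, c)

def run_one_seed(seed: int) -> None:
    torch.manual_seed(seed)
    model = Seq2SeqAttnEncoder(VOCAB_SIZE)
    # Adam is cited in the paper; this learning rate is an assumption.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # ... training loop and evaluation would go here ...

for seed in SEEDS:
    run_one_seed(seed)   # results are then averaged over the three runs
```

This sketch covers only the encoder and optimizer wiring; the attention decoder and the latent bag-of-words components described in the paper are omitted, and the remaining configuration details are in the paper's appendix.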