CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling

Authors: Ning Miao, Hao Zhou, Lili Mou, Rui Yan, Lei Li (pp. 6834-6842)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method on a variety of tasks, including keywords-to-sentence generation, unsupervised sentence paraphrasing, and unsupervised sentence error correction. CGMH achieves high performance compared with previous supervised methods for sentence generation.
Researcher Affiliation | Collaboration | Ning Miao (1), Hao Zhou (2), Lili Mou (3), Rui Yan (1), Lei Li (2); (1) Institute of Computer Science and Technology, Peking University, China; (2) ByteDance AI Lab, Beijing, China; (3) AdeptMind Research, Toronto, Canada
Pseudocode | No | No structured pseudocode or algorithm blocks were found; the method is described in prose and mathematical equations within the "Approach" section. A hedged sketch of the sampling loop is given after this table.
Open Source Code | Yes | Our code is released at https://github.com/NingMiao/CGMH
Open Datasets | Yes | For keywords-to-sentence generation, we trained our language model on randomly chosen 5M sentences from the One-Billion-Word Corpus (Chelba et al. 2013)... used a standard benchmark, the Quora dataset, to evaluate each model... evaluated our method on JFLEG (Napoles, Sakaguchi, and Tetreault 2017), a newly released dataset for sentence correction.
Dataset Splits | Yes | We followed the standard dataset split, which holds out 3k and 30k for validation and testing, respectively (Quora dataset)... It contains 1501 sentences (754 for validation and 747 for test) (JFLEG dataset).
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory amounts) used for running the experiments are provided in the paper; it only describes the model architecture (e.g., "a two-layer LSTM").
Software Dependencies | No | The paper mentions using the "en" package for sentence error correction: "To better handle typos and tense errors, we employ en package to provide an additional candidate word set containing possible words with similar spellings or the same root." However, no version number is provided for this package or for any other software library. A generic stand-in for the similar-spelling candidate set appears after this table.
Experiment Setup | Yes | Our language models are simply a two-layer LSTM with a hidden size of 300. We chose the 50k most frequent words as the dictionary. For MH sampling, we used the sequence of keywords as the initial state and chose the sentence with the lowest perplexity after 100 steps as the output. We set the maximum sampling step to 200. (A hedged PyTorch rendering of this configuration follows below.)
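
Since the paper contains no pseudocode, the following is a minimal sketch of the word-level Metropolis-Hastings loop described in the quotes above. It assumes a trained language-model scorer; `lm_logprob`, `vocab`, and all helper names are illustrative assumptions rather than the authors' released code, and the uniform proposals used here stand in for the paper's LM-guided proposals (which would require the full proposal-ratio correction in the acceptance test).

```python
import math
import random

def log_pi(sentence, keywords, lm_logprob):
    """Unnormalized log stationary distribution: language-model log-probability
    gated by a hard constraint (every keyword must remain in the sentence)."""
    if not all(k in sentence for k in keywords):
        return float("-inf")  # constraint violated -> probability zero
    return lm_logprob(sentence)

def mh_step(sentence, keywords, lm_logprob, vocab):
    """One MH move: replace, insert, or delete a random word, then accept or
    reject by the change in log_pi. With the uniform proposals used here the
    acceptance test is only approximate; the paper's LM-guided proposals need
    the proposal-probability ratio as well."""
    op = random.choice(["replace", "insert", "delete"])
    cand = list(sentence)
    if op == "replace":
        cand[random.randrange(len(cand))] = random.choice(vocab)
    elif op == "insert":
        cand.insert(random.randrange(len(cand) + 1), random.choice(vocab))
    elif len(cand) > 1:  # delete, but never empty the sentence
        del cand[random.randrange(len(cand))]
    # The current sentence always satisfies the constraint, so this difference
    # is either finite or -inf (candidate rejected), never NaN.
    log_alpha = log_pi(cand, keywords, lm_logprob) - log_pi(sentence, keywords, lm_logprob)
    return cand if math.log(random.random() + 1e-12) < log_alpha else sentence

def cgmh_keywords_to_sentence(keywords, lm_logprob, vocab, n_steps=100):
    """Start from the keyword sequence and return the lowest-perplexity sample,
    mirroring the setup quoted in the Experiment Setup row."""
    sentence, best, best_lp = list(keywords), list(keywords), float("-inf")
    for _ in range(n_steps):
        sentence = mh_step(sentence, keywords, lm_logprob, vocab)
        per_token_lp = lm_logprob(sentence) / max(len(sentence), 1)
        if per_token_lp > best_lp:  # higher per-token log-prob = lower perplexity
            best, best_lp = list(sentence), per_token_lp
    return best
```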
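The `lm_logprob` scorer above corresponds to the language model pinned down in the Experiment Setup row. The authors' release is TensorFlow; purely as an illustrative translation, the quoted configuration in PyTorch might look like this (the embedding size is an assumption, as the paper does not state it):

```python
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    """Two-layer LSTM language model with hidden size 300 over a 50k-word
    vocabulary, matching the configuration quoted in the table above."""
    def __init__(self, vocab_size=50_000, hidden_size=300, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)  # embedding size assumed
        self.lstm = nn.LSTM(hidden_size, hidden_size, num_layers, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens):  # tokens: (batch, seq_len) int64 ids
        h, _ = self.lstm(self.embed(tokens))
        return self.out(h)  # next-token logits at each position
```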
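Finally, for the Software Dependencies row: the exact behavior of the unversioned "en" package is not documented in the paper, but a comparable similar-spelling candidate set can be sketched with Python's standard-library difflib. This is a generic substitute, not the authors' mechanism.

```python
import difflib

def spelling_candidates(word, vocab, n=10, cutoff=0.8):
    """Return vocabulary words whose spelling is close to `word`, e.g. to
    propose corrections for typos. difflib's similarity ratio is a generic
    stand-in for the `en` package's similar-spelling lookup."""
    return difflib.get_close_matches(word, vocab, n=n, cutoff=cutoff)

# Usage: a misspelling such as "recieve" should surface "receive".
print(spelling_candidates("recieve", ["receive", "deceive", "recipe", "review"]))
```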