Unsupervised Text Generation by Learning from Search
Authors: Jingjing Li, Zichao Li, Lili Mou, Xin Jiang, Michael Lyu, Irwin King
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of TGLS on two real-world natural language generation tasks, unsupervised paraphrasing and text formalization. Our model significantly outperforms unsupervised baseline methods in both tasks. |
| Researcher Affiliation | Collaboration | 1The Chinese University of Hong Kong; 2Huawei Noah's Ark Lab; 3University of Alberta; Alberta Machine Intelligence Institute (Amii) |
| Pseudocode | Yes | Algorithm 1: Training TGLS |
| Open Source Code | Yes | 1Code is available at https://github.com/jingjingli01/TGLS |
| Open Datasets | Yes | we conducted experiments on the Quora benchmark dataset. |
| Dataset Splits | Yes | For validation and testing, we had 500 and 170K samples, respectively. |
| Hardware Specification | Yes | The experiments were conducted on a cluster with Nvidia Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions software components like GPT2 and RoBERTa, but does not specify version numbers for these or other libraries and programming languages used to implement the experiments. |
| Experiment Setup | Yes | For SA, the initial temperature was set to 1e-2 in both tasks. The total search steps and temperature cooling were 50, 2e-4 for paraphrasing; and 100 and 1e-4 for text simplification. The scorers' weights were tuned by grid search, set as (α, β, γ, δ) = (0.8, 1, 0.6, 0.125) for paraphrasing, and (0.8, 2, 1.25, 0.26) for text formalization. We keep the RoBERTa fixed and further tune the GPT2 model by alternations of search and learning for another 6 epochs. |
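
The "Experiment Setup" row above can be read as a simulated-annealing search configuration. The following is a minimal sketch (not the authors' released code) of how those quoted hyperparameters could be wired into an annealed search loop; the `score_fn` and `propose_fn` callables, the `SA_CONFIG`/`SCORER_WEIGHTS` names, and the linear cooling schedule are assumptions for illustration only.

```python
import math
import random

# Hyperparameters quoted in the Experiment Setup row (per task).
SA_CONFIG = {
    "paraphrasing":  {"init_temp": 1e-2, "steps": 50,  "cooling": 2e-4},
    "formalization": {"init_temp": 1e-2, "steps": 100, "cooling": 1e-4},
}

# Scorer weights (alpha, beta, gamma, delta) found by grid search.
SCORER_WEIGHTS = {
    "paraphrasing":  (0.8, 1.0, 0.6,  0.125),
    "formalization": (0.8, 2.0, 1.25, 0.26),
}

def simulated_annealing(sentence, score_fn, propose_fn, task="paraphrasing"):
    """Annealed local search: always accept improvements, and accept
    worse candidates with probability exp(delta / temperature)."""
    cfg = SA_CONFIG[task]
    current, current_score = sentence, score_fn(sentence)
    temp = cfg["init_temp"]
    for _ in range(cfg["steps"]):
        candidate = propose_fn(current)        # hypothetical edit operator
        cand_score = score_fn(candidate)
        delta = cand_score - current_score
        if delta > 0 or random.random() < math.exp(delta / max(temp, 1e-8)):
            current, current_score = candidate, cand_score
        temp = max(temp - cfg["cooling"], 0.0)  # assumed linear cooling
    return current
```

In this reading, `score_fn` would combine the task-specific scorers with the (α, β, γ, δ) weights above, and the search output would then serve as the target for the learning phase described in Algorithm 1.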