PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model

Authors: Yizhe Zhang, Jiatao Gu, Zhuofeng Wu, Shuangfei Zhai, Joshua Susskind, Navdeep Jaitly

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The proposed method is evaluated on various conditional generation tasks, and results on semantic generation, text completion and summarization show its effectiveness in generating high-quality long-form text in an efficient manner.
Researcher Affiliation | Industry | Yizhe Zhang, Jiatao Gu, Zhuofeng Wu, Shuangfei Zhai, Josh Susskind, Navdeep Jaitly. Apple. {yizzhang, jgu32, zhuofeng_wu, szhai, jsusskind, njaitly}@apple.com
Pseudocode | No | No pseudocode or algorithm blocks found.
Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the described methodology.
Open Datasets | Yes | For the Sentiment-guided generation task, we used the Trip Advisor dataset provided by (Li et al., 2014). For the text completion task, our model was assessed on two datasets: 1) the aforementioned Trip Advisor review dataset... and 2) one-tenth of the overall C4 datasets (Raffel et al., 2020)... For the summarization task, we use CNN/Daily Mail (Hermann et al., 2015) and XSum (Narayan et al., 2018).
Dataset Splits | Yes | The datasets were partitioned into training, validation, and test in the ratios of (0.96, 0.02, 0.02). (See the split sketch below the table.)
Hardware Specification | Yes | We conducted inference time benchmarks of each method on a single Nvidia A100.
Software Dependencies | No | The paper mentions software components like BERT-large, GPT-medium, T5-large, and PyTorch, but does not provide specific version numbers for these or other dependencies.
Experiment Setup | Yes | The embedding dimension h was 1024, and the number of paragraph embeddings k was set to 16, as increasing the number did not result in significant improvement in performance. We provide more analysis on the impact of k in App. A.2. The learning rate was set to 2e-4, and β was set to 5e-6. For the latent diffusion model, the channel size was set to 1024 to match the embedding dimension h, and the number of heads was set to 16 with 28 transformer layers. The total size of the latent diffusion model was 533M. The feature encoder was also jointly learned, and was initialized with a T5-large encoder. We use DDIM throughout our experiments as it shows better performance than DDPM. In all our experiments, we use 30 diffusion steps to generate the final z, which strikes a good balance among efficiency, diversity, and relevance. In comparison, Diff-LM (Li et al., 2022) and Genie (Lin et al., 2022) report using 200 steps and 2000 steps, respectively, to generate high-quality text. We set the CFG weights to 2 and 5 for the text completion and summarization tasks, respectively, based on generation performance on the validation set.
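
The Dataset Splits row above quotes only the (0.96, 0.02, 0.02) ratios; the paper does not describe how the partition was produced. The sketch below is a minimal, hypothetical illustration assuming a seeded, example-level random shuffle; the function name and the seed value are our own and not taken from the paper.

```python
import random

def split_dataset(examples, ratios=(0.96, 0.02, 0.02), seed=0):
    """Partition a list of examples into train/validation/test subsets
    using the ratios reported in the paper (procedure assumed, not stated)."""
    assert abs(sum(ratios) - 1.0) < 1e-9
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_val = int(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```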
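
For convenience, the hyperparameters quoted in the Experiment Setup row are collected below into a single hypothetical configuration object. The field names are ours (no official code is released), and reading β as the KL-term weight of the variational paragraph embedder is an assumption; the values themselves are taken directly from the text above.

```python
from dataclasses import dataclass

@dataclass
class PlannerConfig:
    # Paragraph (latent) embeddings
    latent_dim: int = 1024            # embedding dimension h
    num_latents: int = 16             # number of paragraph embeddings k
    # Training of the paragraph embedder
    learning_rate: float = 2e-4
    kl_beta: float = 5e-6             # β (assumed to weight the KL term)
    feature_encoder_init: str = "t5-large"  # encoder jointly learned, T5-large init
    # Latent diffusion transformer (~533M parameters)
    channels: int = 1024              # channel size matches h
    num_heads: int = 16
    num_layers: int = 28
    # Sampling
    sampler: str = "DDIM"             # reported to outperform DDPM here
    diffusion_steps: int = 30         # vs. 200 (Diff-LM) and 2000 (Genie)
    cfg_weight_completion: float = 2.0     # classifier-free guidance, text completion
    cfg_weight_summarization: float = 5.0  # classifier-free guidance, summarization
```

Note that the comparatively small number of diffusion steps (30 with DDIM, versus 200 for Diff-LM and 2000 for Genie) and the task-dependent CFG weights are the settings the authors highlight for the reported efficiency and generation quality.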