Diffusion-LM Improves Controllable Text Generation

Authors: Xiang Li, John Thickstun, Ishaan Gulrajani, Percy S. Liang, Tatsunori B. Hashimoto

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate successful control of Diffusion-LM for six challenging fine-grained control tasks, significantly outperforming prior work. We train Diffusion-LM for two language modeling tasks. We then apply the controllable generation method to 5 classifier-guided control tasks, and apply MBR decoding to a classifier-free control task (i.e., infilling). We measure the impact of our proposed design choices through lm-score. (Illustrative sketches of the classifier-guided update and MBR selection follow the table.)
Researcher Affiliation | Academia | Xiang Lisa Li, Stanford University, xlisali@stanford.edu; John Thickstun, Stanford University, jthickst@stanford.edu; Ishaan Gulrajani, Stanford University, igul@stanford.edu; Percy Liang, Stanford University, pliang@cs.stanford.edu; Tatsunori B. Hashimoto, Stanford University, thashim@stanford.edu
Pseudocode | No | The paper describes the methods and processes in narrative text and with diagrams, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/XiangLi1999/Diffusion-LM.git
Open Datasets | Yes | We train Diffusion-LM on two datasets: E2E [34] and ROCStories [32].
Dataset Splits | Yes | For each control task (e.g., semantic content), we sample 200 control targets c (e.g., rating=5 star) from the validation splits, and we generate 50 samples for each control target.
Hardware Specification | No | The paper mentions the model architecture and parameter count ('80M parameters') but does not provide any specific details regarding the hardware (e.g., CPU/GPU models, memory) used for experiments.
Software Dependencies | No | The paper mentions models like Transformer and GPT-2, and optimizers like Adagrad, but it does not specify any software libraries or dependencies with version numbers (e.g., 'PyTorch 1.x' or 'TensorFlow 2.x').
Experiment Setup | Yes | Our Diffusion-LM is based on the Transformer [52] architecture with 80M parameters, with a sequence length n = 64, diffusion steps T = 2000, and a square-root noise schedule (see Appendix A for details). We treat the embedding dimension as a hyperparameter, setting d = 16 for E2E and d = 128 for ROCStories. See Appendix B for hyperparameter details.
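
As a reference point for the Experiment Setup row, the square-root noise schedule can be written down compactly. The sketch below assumes the form alpha_bar_t = 1 - sqrt(t/T + s) with a small offset s, which is how the paper describes the schedule in Appendix A; the offset value, the clipping, and the function name are illustrative choices rather than details of the released code.

```python
import numpy as np

# Reported setup: T = 2000 diffusion steps; embedding dim d = 16 (E2E) or d = 128 (ROCStories).
T = 2000

def sqrt_noise_schedule(T=T, s=1e-4):
    """Assumed square-root schedule: alpha_bar_t = 1 - sqrt(t/T + s).

    The noise level 1 - alpha_bar_t rises quickly for small t and then more
    gradually toward t = T. Clipping keeps alpha_bar in [0, 1] despite the
    small offset s (an illustrative value here).
    """
    t = np.arange(T + 1)
    return np.clip(1.0 - np.sqrt(t / T + s), 0.0, 1.0)

alpha_bar = sqrt_noise_schedule()
# The forward process samples x_t ~ N(sqrt(alpha_bar[t]) * x_0, (1 - alpha_bar[t]) * I).
```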
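The Research Type row refers to the controllable generation method applied to the classifier-guided tasks. The paper frames this as gradient updates on the continuous diffusion latents, trading off the classifier's log-probability of the control target against fluency under the diffusion model. The sketch below is a hedged reconstruction, not the released implementation: classifier_logp and diffusion_logp are hypothetical callables standing in for the trained attribute classifier and the model's denoising likelihood term, the hyperparameter values are placeholders, and Adagrad mirrors the optimizer the paper mentions.

```python
import torch

def controlled_update(x_prev, target, classifier_logp, diffusion_logp,
                      fluency_weight=0.01, lr=0.1, num_steps=3):
    """One control step on the continuous latent x_{t-1} (shape (n, d)).

    Performs gradient ascent on
        classifier_logp(x, target) + fluency_weight * diffusion_logp(x),
    pushing the latent toward the control target while keeping it likely
    under the diffusion model. Both callables are assumed to return scalar
    log-probability tensors.
    """
    x = x_prev.detach().clone().requires_grad_(True)
    optimizer = torch.optim.Adagrad([x], lr=lr)
    for _ in range(num_steps):
        optimizer.zero_grad()
        loss = -(classifier_logp(x, target) + fluency_weight * diffusion_logp(x))
        loss.backward()
        optimizer.step()
    return x.detach()
```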
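For the classifier-free infilling task, the same row notes that MBR (minimum Bayes risk) decoding is used instead: draw a set of candidate generations and return the one with the lowest expected risk against the others. The sketch below uses a token-level edit distance as a stand-in risk function; the paper's actual risk (e.g., a BLEU-based loss) and candidate count are not restated here, so treat those specifics as illustrative.

```python
def edit_distance(a, b):
    """Token-level Levenshtein distance, used here only as a stand-in risk."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (x != y))
    return dp[-1]

def mbr_select(candidates, risk=edit_distance):
    """Return the candidate with the lowest average risk against all other candidates."""
    def expected_risk(c):
        others = [o for o in candidates if o is not c]
        return sum(risk(c, o) for o in others) / max(len(others), 1)
    return min(candidates, key=expected_risk)

# Usage (hypothetical): mbr_select([s.split() for s in sampled_infills])
```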