Language modeling via stochastic processes

Authors: Rose E. Wang, Esin Durmus, Noah Goodman, Tatsunori B. Hashimoto

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We now evaluate the ability of Time Control to capture text dynamics. Specifically, we aim to answer the following research questions (RQ): ... We run Time Control with different latent dimensions (d = 8, 16, 32).
Researcher Affiliation | Academia | Rose E. Wang, Esin Durmus, Noah Goodman, Tatsunori B. Hashimoto, Stanford University, {rewang, edurmus, ngoodman, thashim}@stanford.edu
Pseudocode | No | The paper describes its methods in prose but does not include explicit pseudocode or algorithm blocks.
Open Source Code | Yes | The accompanying code can be found here: https://github.com/rosewang2008/language_modeling_via_stochastic_processes.
Open Datasets | Yes | Datasets: We use language datasets that elicit different kinds of structure, from section structure to discourse structure to narrative structure. Time Control does not take in any information about the structure, treating each domain the same under its encoding objective. More information and dataset examples are provided in Appendix E. Wikisection (Arnold et al., 2019) ... Wikihow (WH) (Koupaee & Wang, 2018) ... RecipeNLG (Bień et al., 2020) ... Taskmaster-2 (TM-2) (Byrne et al., 2019) ... Ticket Talk (Byrne et al., 2021) ... ROC Stories (Mostafazadeh et al., 2016) ...
Dataset Splits | Yes | We fine-tune for 10 epochs and checkpoint the models every 1000 steps; we keep the model checkpoint that scores the lowest PPL on a held-out validation set.
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU model, CPU type) used for running experiments.
Software Dependencies | No | The paper mentions using a 'frozen, pretrained GPT2 model from Huggingface' but does not specify version numbers for GPT2, Huggingface libraries, or any other software dependencies.
Experiment Setup | Yes | The MLP network has intermediate ReLU activations and is trained with stochastic gradient descent with a learning rate of 1e-4 and with momentum 0.9.
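The Software Dependencies and Dataset Splits rows above describe a frozen, pretrained GPT2 model from Huggingface and checkpoint selection by lowest validation perplexity. The following is a minimal sketch of that setup, assuming PyTorch and the Hugging Face transformers library; the batching details and the validation_ppl helper are illustrative assumptions, not the authors' code.

    # Sketch only: load a pretrained GPT-2 and freeze its weights, then score
    # validation perplexity (exp of mean cross-entropy), which the paper uses
    # to pick the best checkpoint.
    import math
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    for param in model.parameters():   # freeze GPT-2; only downstream
        param.requires_grad = False    # components would be trained
    model.eval()

    @torch.no_grad()
    def validation_ppl(model, val_batches):
        """Perplexity on a held-out validation set (hypothetical helper)."""
        losses = []
        for batch in val_batches:  # each batch: dict with input_ids, attention_mask
            out = model(**batch, labels=batch["input_ids"])
            losses.append(out.loss.item())
        return math.exp(sum(losses) / len(losses))

Checkpoint selection as described in the Dataset Splits row would then amount to evaluating validation_ppl every 1000 training steps and keeping the checkpoint with the smallest value.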
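The Experiment Setup row quotes an MLP with intermediate ReLU activations trained by stochastic gradient descent with learning rate 1e-4 and momentum 0.9. A minimal PyTorch sketch of that configuration is below; the layer widths are illustrative assumptions, since the quoted sentence does not report them.

    # Sketch only: an MLP with ReLU activations and the quoted SGD settings.
    import torch
    import torch.nn as nn

    mlp = nn.Sequential(
        nn.Linear(768, 256),  # hypothetical input/hidden sizes
        nn.ReLU(),
        nn.Linear(256, 32),   # e.g. latent dimension d = 32, one of the values swept in the paper
    )

    optimizer = torch.optim.SGD(mlp.parameters(), lr=1e-4, momentum=0.9)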