Pretraining Language Models with Human Preferences
Authors: Tomasz Korbak, Kejian Shi, Angelica Chen, Rasika Vinayak Bhalerao, Christopher Buckley, Jason Phang, Samuel R. Bowman, Ethan Perez
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We benchmark five objectives for pretraining with human feedback across three tasks and study how they affect the trade-off between alignment and capabilities of pretrained LMs. We find a Pareto-optimal and simple approach among those we explored: conditional training, or learning a distribution over tokens conditional on their human preference scores given by a reward model. (A minimal sketch of conditional training follows the table.) |
| Researcher Affiliation | Collaboration | University of Sussex, New York University, FAR AI, Northeastern University, Anthropic. |
| Pseudocode | No | The paper describes the different pretraining objectives mathematically but does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The code and datasets accompanying the paper are available at github.com/tomekkorbak/pretraining-with-human-feedback |
| Open Datasets | Yes | For toxicity and PII, we prepared training data by subsampling 1.95M documents (totaling 3.32B tokens) from the Pile (Gao et al., 2020). For code generation, we subsampled 1.5M Python files (again totaling 3.32B tokens) from a cleaned and filtered version of the GitHub dataset from Google BigQuery released by Tunstall et al. (2022). |
| Dataset Splits | Yes | We sweep hyperparameters for each GLUE task based on the toxicity MLE-pretrained LM's dev set scores. ... We train each LM for each GLUE task for a maximum of 6 epochs with early stopping based on dev scores. |
| Hardware Specification | No | The paper mentions running experiments and references the compute-optimal scaling laws, but it does not specify any particular hardware components like GPU or CPU models, or cloud computing instance types. |
| Software Dependencies | No | The paper mentions various software tools used (e.g., Detoxify, SpaCy, Scrubadub, pycodestyle) but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | We keep the original hyperparameters of gpt2-small except for learning rate and batch size, which we tune for each task-objective pair based on train loss. If an objective has its own hyperparameters (e.g. t, α or β), we tune learning rate and batch size separately for each (t, α, β) configuration considered and then choose the best (t, α, β) configuration based on the misalignment score of LM samples and the KL divergence from GPT-3 (Section 4.1). See Appendix B for hyperparameters used in experiments and ablations on them. (A dominance-based selection sketch follows the table.) |
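The conditional training objective highlighted under Research Type lends itself to a short illustration. The sketch below is a hypothetical rendering of the idea, not the authors' implementation: the threshold, control-token names, and toy reward function are assumptions standing in for the paper's reward models (e.g., Detoxify for toxicity).

```python
# Illustrative sketch of conditional training: score each pretraining
# document with a reward model, prepend a control token based on that
# score, then train with ordinary MLE on the annotated corpus.
# The token names, threshold, and toy reward are assumptions.
from typing import Callable, Iterable, Iterator

GOOD_TOKEN = "<|good|>"  # hypothetical control-token names
BAD_TOKEN = "<|bad|>"

def annotate_corpus(
    docs: Iterable[str],
    reward_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> Iterator[str]:
    """Prefix each document with a control token reflecting its reward score."""
    for doc in docs:
        token = GOOD_TOKEN if reward_fn(doc) >= threshold else BAD_TOKEN
        yield token + doc

if __name__ == "__main__":
    # Toy reward model: penalize a blocklisted word (stand-in for e.g. Detoxify).
    toy_reward = lambda text: 0.0 if "toxic" in text else 1.0
    corpus = ["a perfectly fine sentence", "a toxic sentence"]
    for annotated in annotate_corpus(corpus, toy_reward):
        print(annotated)
    # Standard MLE pretraining then runs on the annotated corpus; at inference,
    # generation is conditioned on GOOD_TOKEN to steer toward preferred text.
```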
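The Experiment Setup row mentions choosing the best (t, α, β) configuration based on the misalignment score of LM samples and the KL divergence from GPT-3. One natural reading, consistent with the paper's Pareto-frontier framing, is dominance-based selection over these two objectives; the sketch below assumes that reading and is not the authors' stated procedure.

```python
# Hypothetical selection sketch: treat each (t, alpha, beta) configuration
# as a point (misalignment score, KL from GPT-3) and keep the configurations
# not dominated by any other. Dominance-based selection is an assumption
# consistent with the paper's Pareto framing, not its documented procedure.
from typing import List, Tuple

Point = Tuple[float, float]  # (misalignment score, KL from GPT-3); lower is better

def dominates(a: Point, b: Point) -> bool:
    """a dominates b if it is no worse on both objectives and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])

def pareto_front(points: List[Point]) -> List[Point]:
    """Return the configurations not dominated by any other configuration."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

if __name__ == "__main__":
    configs = [(0.02, 1.3), (0.05, 0.9), (0.02, 1.5), (0.08, 0.8)]
    print(pareto_front(configs))  # [(0.02, 1.3), (0.05, 0.9), (0.08, 0.8)]
```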