Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets

Authors: Irene Solaiman, Christy Dennison

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our process using three metrics: quantitative human evaluations that score output adherence to a target value; toxicity scoring on outputs; and qualitative analysis of the most common word associated with a given social category. Through each iteration, we add training dataset examples based on shortcomings observed in the evaluations. PALMS performs significantly better on all metrics compared to baseline and control models for a broad range of GPT-3 language model sizes without compromising capability integrity. We find that the effectiveness of PALMS increases with model size. We show that significantly adjusting language model behavior is feasible with a small, hand-curated dataset. (Sketches of the toxicity-scoring and word-association metrics follow the table.) |
| Researcher Affiliation | Collaboration | Irene Solaiman (Zillow Group, contact@irenesolaiman.com); Christy Dennison (MIT, christy@mit.edu) |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper instructs: "To reproduce these results, use the OpenAI Fine-Tuning API to fine-tune on the same base models we used in this paper." However, this is a third-party API, and pointing to it does not constitute open-sourcing the authors' code for their methodology. (A hedged fine-tuning sketch follows the table.) |
| Open Datasets | No | The paper states: "The dataset of completions, or values-targeted dataset, consists of N = 80 text answers to the questions in Step 3 with lengths between 40 and 340 words." It also mentions a "private corpus of books and Wikipedia articles" for the control dataset. No concrete access information (URL, DOI, specific repository, or formal citation with authors and year) is provided for the datasets created or used. |
| Dataset Splits | No | The paper mentions "Validation and Test Sets" but does not provide explicit split percentages or absolute sample counts for each split. |
| Hardware Specification | No | No specific hardware (GPU/CPU models, memory, or other machine specifications) used to run the experiments is mentioned in the paper. |
| Software Dependencies | No | The paper mentions using the OpenAI Fine-Tuning API and the Perspective API, but does not provide version numbers for these APIs or for any other software components. (A toxicity-scoring sketch using the Perspective API follows the table.) |
| Experiment Setup | Yes | See Appendix C for fine-tuning hyperparameters. We then generated three completions per prompt per model with length 200 and temperature 0.7. (A sketch of this sampling setup follows the table.) |
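Since the paper points to the OpenAI Fine-Tuning API rather than releasing code, a minimal sketch of what reproduction might look like with the current `openai` Python client is below. The file name `values_targeted.jsonl` is hypothetical, and `davinci-002` is a stand-in: the GPT-3 base models the authors fine-tuned in 2021 are no longer exposed through this interface, so this is illustrative, not the authors' setup.

```python
# Sketch only: the paper's exact fine-tuning code is not released.
# Assumes a JSONL file of {"prompt": ..., "completion": ...} lines
# mirroring the values-targeted dataset described in the paper
# (N = 80 answers, 40-340 words each); "values_targeted.jsonl" is
# a hypothetical file name.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the training file.
training_file = client.files.create(
    file=open("values_targeted.jsonl", "rb"),
    purpose="fine-tune",
)

# Launch the fine-tuning job. "davinci-002" stands in for the GPT-3
# base models used in the paper, which are no longer available.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="davinci-002",
)
print(job.id, job.status)
```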
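The experiment-setup row is concrete enough to sketch directly: three completions per prompt per model, length 200, temperature 0.7. Assuming the Completions endpoint (the paper's models are completion-style GPT-3 models) and a placeholder model name:

```python
# Sketch of the stated sampling setup: 3 completions per prompt,
# length 200, temperature 0.7. The model argument is a placeholder
# for the fine-tuned model returned by a job like the one above.
from openai import OpenAI

client = OpenAI()

def sample_completions(model: str, prompt: str) -> list[str]:
    response = client.completions.create(
        model=model,
        prompt=prompt,
        max_tokens=200,    # "length 200" in the paper
        temperature=0.7,
        n=3,               # three completions per prompt per model
    )
    return [choice.text for choice in response.choices]
```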
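The paper scores output toxicity with the Perspective API but gives no version, so a hedged sketch of a typical call to the public `v1alpha1` endpoint is the closest one can get; everything beyond the words "Perspective API" here follows the public documentation rather than the paper.

```python
# Sketch of toxicity scoring with the Perspective API, used in the
# paper for the quantitative toxicity metric. Endpoint and payload
# follow the public v1alpha1 docs; the paper specifies no version.
import requests

PERSPECTIVE_URL = (
    "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"
)

def toxicity_score(text: str, api_key: str) -> float:
    """Return the Perspective TOXICITY summary score in [0, 1]."""
    payload = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = requests.post(
        PERSPECTIVE_URL, params={"key": api_key}, json=payload, timeout=30
    )
    response.raise_for_status()
    scores = response.json()["attributeScores"]
    return scores["TOXICITY"]["summaryScore"]["value"]
```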
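The qualitative metric (the most common word associated with a given social category) is described but not specified. A minimal sketch, assuming it reduces to counting the most frequent non-stopword across the completions generated for a category's prompts; the tokenization and stopword list are assumptions, not the authors' procedure.

```python
# Sketch of the qualitative word-association metric. The paper does
# not specify its exact procedure; tokenization and stopword handling
# below are assumptions made for illustration.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
             "are", "was", "were", "that", "it", "on", "for", "with"}

def most_common_word(completions: list[str]) -> str:
    """Return the most frequent non-stopword across completions."""
    counts: Counter[str] = Counter()
    for text in completions:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    word, _ = counts.most_common(1)[0]
    return word
```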