Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets
Authors: Irene Solaiman, Christy Dennison
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our process using three metrics: quantitative metrics, with human evaluations that score output adherence to a target value and toxicity scoring on outputs; and qualitative metrics, analyzing the most common word associated with a given social category. Through each iteration, we add training dataset examples based on shortcomings observed in the evaluations. PALMS performs significantly better on all metrics than baseline and control models across a broad range of GPT-3 model sizes, without compromising capability integrity, and its effectiveness increases with model size. We show that significantly adjusting language model behavior is feasible with a small, hand-curated dataset. |
| Researcher Affiliation | Collaboration | Irene Solaiman (Zillow Group), contact@irenesolaiman.com; Christy Dennison (MIT), christy@mit.edu |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper instructs: "To reproduce these results, use the OpenAI Fine-Tuning API to fine-tune on the same base models we used in this paper." However, this is a third-party API and does not constitute open-sourcing the authors' specific code for their methodology. A hedged sketch of such a fine-tuning call appears after the table. |
| Open Datasets | No | The paper states: "The dataset of completions, or values-targeted dataset, consists of N = 80 text answers to the questions in Step 3 with lengths between 40 and 340 words." It also mentions a "private corpus of books and Wikipedia articles" for the control dataset. No concrete access information (URL, DOI, specific repository, or formal citation with authors/year) is provided for the datasets created or used. |
| Dataset Splits | No | The paper mentions "Validation and Test Sets" but does not provide explicit split percentages or absolute sample counts for each split. |
| Hardware Specification | No | No specific hardware (GPU/CPU models, memory details, or detailed computer specifications) used for running experiments is mentioned in the paper. |
| Software Dependencies | No | The paper mentions using "the OpenAI Fine-Tuning API" and the "Perspective API" for toxicity scoring (a scoring sketch appears after the table), but does not provide version numbers for these APIs or any other software components. |
| Experiment Setup | Yes | The paper points to Appendix C for fine-tuning hyperparameters and states: "We then generated three completions per prompt per model with length 200 and temperature 0.7." A sketch of these sampling settings appears after the table. |
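
For context on the fine-tuning route named above, the following is a minimal sketch of what such a run might have looked like with the legacy `openai` Python client (v0.x) available in 2021. The example record, file name, delimiters, and base model are illustrative assumptions, not the authors' actual values-targeted data or configuration.

```python
# Minimal sketch, assuming the legacy openai Python client (v0.x) from 2021.
# The example record, file name, and base model are placeholders, not the
# authors' actual values-targeted data or configuration.
import json
import openai

openai.api_key = "YOUR_OPENAI_API_KEY"

# Values-targeted examples as prompt/completion pairs (placeholder content;
# the paper's dataset has N = 80 answers of 40-340 words).
examples = [
    {"prompt": "An evaluation question on a sensitive topic?\n\n###\n\n",
     "completion": " A values-targeted answer written to match the desired behavior. END"},
]

with open("values_targeted.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

# Upload the training file, then launch a fine-tune against a GPT-3 base model.
upload = openai.File.create(file=open("values_targeted.jsonl", "rb"),
                            purpose="fine-tune")
job = openai.FineTune.create(training_file=upload["id"], model="davinci")
print(job["id"])  # poll this job ID until the fine-tune completes
```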
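
The paper's toxicity metric relies on the Perspective API. Below is a minimal sketch of a Perspective `TOXICITY` request; the API key and sample text are placeholders, and the exact attributes and thresholds the authors used are not specified in the paper.

```python
# Minimal sketch of a Perspective API toxicity request; the key and text are
# placeholders. The request/response shape follows the public
# commentanalyzer endpoint documentation.
import requests

API_KEY = "YOUR_PERSPECTIVE_API_KEY"
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

def toxicity_score(text: str) -> float:
    """Return the TOXICITY summary score in [0, 1] for one completion."""
    payload = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = requests.post(URL, json=payload, timeout=30)
    response.raise_for_status()
    scores = response.json()["attributeScores"]
    return scores["TOXICITY"]["summaryScore"]["value"]

print(toxicity_score("A sample model completion to score."))
```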
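
Finally, the reported generation settings (three completions per prompt, length 200, temperature 0.7) map onto the legacy `Completion` endpoint as sketched below; the fine-tuned model ID and prompt are placeholders.

```python
# Minimal sketch of the reported sampling settings with the legacy Completion
# endpoint; the fine-tuned model ID and prompt are placeholders.
import openai

openai.api_key = "YOUR_OPENAI_API_KEY"

response = openai.Completion.create(
    model="davinci:ft-placeholder",  # placeholder fine-tuned model ID
    prompt="An evaluation question on a sensitive topic?",
    max_tokens=200,   # "length 200" in the paper
    temperature=0.7,  # as reported
    n=3,              # three completions per prompt
)
completions = [choice["text"] for choice in response["choices"]]
```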