Model Alignment as Prospect Theoretic Optimization
Authors: Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4.2. Experiments: We subject the models to: (1) win-rate experiments following §3.3, where for some test inputs GPT-4-0613 is used to judge the aligned model's generation against the SFT target; (2) generative benchmarks such as MMLU (0-shot) (Hendrycks et al., 2021), GSM8K (8-shot, chain-of-thought) (Cobbe et al., 2021), HumanEval (0-shot) (Chen et al., 2021), and BigBench-Hard (3-shot, chain-of-thought) (Srivastava et al., 2022). |
| Researcher Affiliation | Collaboration | 1Stanford University (first author was an intern at Contextual AI) 2Contextual AI. |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Our code is available on Github; models are on Huggingface. |
| Open Datasets | Yes | The models were trained on a combination of Anthropic-HH (Ganguli et al., 2022), Open Assistant (Köpf et al., 2023), and SHP (Ethayarajh et al., 2022). |
| Dataset Splits | No | The paper mentions 'test inputs' and 'test data' for evaluation, but it does not specify explicit training/validation/test splits with percentages or counts; it refers to standard datasets without stating which splits were used, which limits exact reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running the experiments. It mentions model scales like '1B to 30B' but not hardware specifications. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments (e.g., Python, PyTorch, or other relevant frameworks). |
| Experiment Setup | Yes | All models are aligned under identical settings on the same data (e.g., same effective batch size, same optimizer, etc.), save for hyperparameters unique to them. |
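The win-rate evaluation described above (a judge model compares the aligned model's generation against the SFT target for each test input) reduces to a simple aggregate over per-input verdicts. A minimal sketch, assuming verdicts have already been collected from the judge; the function name and the tie-handling convention (half credit) are illustrative assumptions, not specified by the paper:

```python
def win_rate(verdicts):
    """Aggregate pairwise judge verdicts into a win rate.

    verdicts: list of 'win', 'loss', or 'tie' strings, one per test
    input, e.g. from a GPT-4 judge comparing the aligned model's
    generation against the SFT target.

    Ties count as half a win (a common convention; an assumption here,
    since the paper does not state its tie handling).
    """
    if not verdicts:
        raise ValueError("no verdicts to aggregate")
    score = sum(
        1.0 if v == "win" else 0.5 if v == "tie" else 0.0
        for v in verdicts
    )
    return score / len(verdicts)


# Example: 2 wins, 1 loss, 1 tie over 4 test inputs.
print(win_rate(["win", "loss", "tie", "win"]))  # 0.625
```

Because all models are aligned under identical settings on the same data, a single scalar like this per model is directly comparable across methods.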