Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Model Alignment as Prospect Theoretic Optimization
Authors: Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4.2. Experiments We subject the models to: (1) winrate experiments following 3.3, where for some test inputs GPT-4-0613 is used to judge the aligned model's generation against the SFT target; (2) generative benchmarks such as MMLU (0-shot) (Hendrycks et al., 2021), GSM8K (8-shot, chain-of-thought) (Cobbe et al., 2021), HumanEval (0-shot) (Chen et al., 2021), and BigBench-Hard (3-shot chain-of-thought) (Srivastava et al., 2022). |
| Researcher Affiliation | Collaboration | ¹Stanford University (first author was an intern at Contextual AI); ²Contextual AI. |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Our code is available on Github; models are on Huggingface. |
| Open Datasets | Yes | The models were trained on a combination of Anthropic-HH (Ganguli et al., 2022), Open Assistant (Köpf et al., 2023), and SHP (Ethayarajh et al., 2022). |
| Dataset Splits | No | The paper mentions 'test inputs' and 'test data' for evaluation, but it does not specify explicit training/validation/test dataset splits with percentages or counts for its experiments. It refers to standard datasets but not their specific splits for reproduction purposes. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running the experiments. It mentions model scales like '1B to 30B' but not hardware specifications. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments (e.g., Python, PyTorch, or other relevant frameworks). |
| Experiment Setup | Yes | All models are aligned under identical settings on the same data (e.g., same effective batch size, same optimizer, etc.), save for hyperparameters unique to them. |