Position: A Roadmap to Pluralistic Alignment
Authors: Taylor Sorensen, Jared Moore, Jillian Fisher, Mitchell L Gordon, Niloofar Mireshghallah, Christopher Michael Rytting, Andre Ye, Liwei Jiang, Ximing Lu, Nouha Dziri, Tim Althoff, Yejin Choi
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We highlight empirical evidence, both from our own experiments and from other work, that standard alignment procedures might reduce distributional pluralism in models, motivating the need for further research on pluralistic alignment." and "As shown in Table 1, almost all pre-aligned models have lower Jensen-Shannon distance to the target human distribution than the post-aligned models for both datasets." (See the Jensen-Shannon sketch after the table.) |
| Researcher Affiliation | Collaboration | 1Department of Computer Science, University of Washington, Seattle, Washington, USA 2Department of Computer Science, Stanford University, Stanford, California, USA 3Department of Statistics, University of Washington, Seattle, Washington, USA 4Department of Electrical Engineering and Computer Science, MIT, Cambridge, Massachusetts, USA 5Allen Institute for Artificial Intelligence, Seattle, Washington, USA. |
| Pseudocode | No | The paper defines concepts and discusses implementations but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides a link to experimental code ('Code can be found at: https://github.com/jfisher52/AI_Pluralistic_Alignment'), but this code covers only the specific experiments validating one hypothesis, not the broader conceptual framework for pluralistic alignment that the paper describes. |
| Open Datasets | Yes | We use two diverse multiple-choice datasets: the Global Opinion QA (Global QA) dataset, an aggregation of cross-national surveys designed to capture opinions on global issues (Durmus et al., 2023), and the Machine Personality Inventory (MPI), a collection of 120 questions designed to evaluate human personality traits (Jiang et al., 2023). |
| Dataset Splits | No | The paper describes using existing datasets (Global Opinion QA, MPI) for evaluation by comparing model distributions to human distributions, but it does not specify traditional training, validation, or test splits for these datasets within the context of their own experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU, CPU models, memory) used to conduct the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., Python, PyTorch, or specific library versions) used for the experiments. |
| Experiment Setup | Yes | "To create the model distribution, we utilized the technique of in-context learning to steer the model to output the letter of the multiple choice answer it wanted to select as the first, next token. In order to remove any bias these in-context examples might implicitly have, we prompted the model with the same prompt a total of 5 times, each time randomly selecting the 'correct' answer shown in the in-context examples. We then averaged the probabilities over these five distributions." (See the prompt-averaging sketch after the table.) |
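
The "Research Type" row quotes the paper's Jensen-Shannon comparison between model answer distributions and target human distributions. The sketch below (not the authors' released code) shows how such a comparison can be computed with SciPy's `jensenshannon`; the two example distributions are illustrative, not values from the paper.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def js_distance(model_probs, human_probs):
    """Jensen-Shannon distance between two answer distributions."""
    p = np.asarray(model_probs, dtype=float)
    q = np.asarray(human_probs, dtype=float)
    p, q = p / p.sum(), q / q.sum()  # normalize defensively
    return jensenshannon(p, q, base=2)

# Illustrative 4-option question: model probabilities vs. aggregated
# human response shares (made-up numbers, not results from the paper).
model_dist = [0.70, 0.10, 0.10, 0.10]
human_dist = [0.40, 0.30, 0.20, 0.10]
print(f"JS distance: {js_distance(model_dist, human_dist):.3f}")
```

A lower distance means the model's answer distribution sits closer to the human distribution, which is the direction the paper reports for most pre-aligned models.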
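
The "Experiment Setup" row describes prompting the model five times, each time randomizing which answers the in-context examples mark as "correct", and averaging the resulting answer-letter distributions. The following sketch illustrates that loop under stated assumptions: `first_token_letter_probs` is a hypothetical stand-in for a real LM call that returns first-token probabilities over the answer letters, and the prompt template is an assumption, not the paper's exact format.

```python
import random
import numpy as np

LETTERS = ["A", "B", "C", "D"]

def first_token_letter_probs(prompt):
    # Hypothetical stand-in: replace with a real LM call that returns the
    # probability the model assigns each answer letter as its first token.
    return [1.0 / len(LETTERS)] * len(LETTERS)

def build_prompt(question, options, icl_examples, rng):
    """Format in-context examples, each with a randomly chosen 'correct'
    answer, followed by the target question (template is an assumption)."""
    blocks = []
    for ex_q, ex_opts in icl_examples:
        choices = "\n".join(f"{l}. {o}" for l, o in zip(LETTERS, ex_opts))
        answer = rng.choice(LETTERS[: len(ex_opts)])  # randomized label
        blocks.append(f"{ex_q}\n{choices}\nAnswer: {answer}")
    choices = "\n".join(f"{l}. {o}" for l, o in zip(LETTERS, options))
    blocks.append(f"{question}\n{choices}\nAnswer:")
    return "\n\n".join(blocks)

def model_distribution(question, options, icl_examples, n_runs=5, seed=0):
    """Average answer-letter probabilities over n_runs randomized prompts."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_runs):
        prompt = build_prompt(question, options, icl_examples, rng)
        probs = first_token_letter_probs(prompt)
        runs.append(probs[: len(options)])
    return np.mean(runs, axis=0)

# Example call with one illustrative in-context example.
icl = [("Is the sky blue?", ["Yes", "No"])]
print(model_distribution("Do you trust science?", ["Yes", "No", "Unsure"], icl))
```

Averaging over the five randomized prompts is what the quoted setup uses to wash out any bias the randomly assigned in-context "correct" labels might introduce.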