Panacea: Pareto Alignment via Preference Adaptation for LLMs

Authors: Yifan Zhong, Chengdong Ma, Xiaoyuan Zhang, Ziran Yang, Haojun Chen, Qingfu Zhang, Siyuan Qi, Yaodong Yang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In our experiments, we assess the effectiveness and scalability of Panacea on several significant and challenging preference alignment problems with up to 10 dimensions, where the Pareto set cardinality grows exponentially with the number of dimensions, considerably surpassing the scope of current research."
Researcher Affiliation | Academia | (1) Institute for Artificial Intelligence, Peking University; (2) State Key Laboratory of General Artificial Intelligence; (3) Department of Computer Science, City University of Hong Kong; (4) Yuanpei College, Peking University.
Pseudocode | Yes | Appendix D of the paper provides pseudocode; a Python sketch of this training loop appears after the table.

Algorithm 1 Panacea
1: Input: rank k, preference dimension m, dataset D, iterations T, initial model πinit (and, optionally, a reward model ri for each preference dimension i).
2: Output: trained policy πθ.
3: Initialize πθ by initializing SVD-LoRA upon πinit based on k and m.
4: for t in 1 ... T do
5:   Sample from D a data batch B.
6:   Sample a preference vector λ and embed it into πθ,λ.
7:   Compute the aggregated objective for πθ,λ on B according to λ.
8:   Update θ with gradient descent.
9: end for
10: Return πθ.
Open Source Code | No | While the implementation is based on the open-source Safe-RLHF codebase, the paper does not explicitly state that the code specific to Panacea is being released, nor does it provide a direct link to a Panacea repository.
Open Datasets | Yes | BeaverTails [28]. License: Creative Commons Attribution-NonCommercial 4.0. URL: https://huggingface.co/datasets/PKU-Alignment/PKU-SafeRLHF (a loading sketch appears after the table).
Dataset Splits | No | The paper describes the training dataset size and batch sizes, but it does not explicitly specify the proportions or sizes of the training, validation, and test splits used in the experiments.
Hardware Specification | Yes | "All our experiments are conducted on an 8 A800-80GB GPU server."
Software Dependencies | No | The paper states that its implementation is based on the Safe-RLHF codebase and lists the other models used, but it does not provide version numbers for key software components or libraries (e.g., Python, PyTorch, CUDA, or the specific revision of the Safe-RLHF codebase).
Experiment Setup | Yes | "In this part, we present details about the experiment setup. In Table 2, Table 4, and Table 5 we provide the common hyperparameters for Panacea with RLHF, DPO, and SFT."
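
To make Algorithm 1 concrete, here is a minimal PyTorch-style sketch of the training loop, not the authors' implementation. It assumes a policy whose SVD-LoRA layers expose a set_preference hook for embedding λ; sample_batch and per_dimension_losses are hypothetical helpers, and the aggregation shown is plain linear scalarization of the per-dimension objectives.

```python
# Minimal sketch of Algorithm 1 (Panacea) under stated assumptions: a PyTorch
# policy whose SVD-LoRA layers expose a `set_preference` hook for embedding
# the preference vector. `sample_batch`, `per_dimension_losses`, and
# `set_preference` are hypothetical placeholders, not the authors' API.
import torch

def sample_preference_vector(m: int) -> torch.Tensor:
    """Draw lambda uniformly from the (m-1)-simplex (one weight per dimension)."""
    # Normalized Exp(1) draws are uniformly distributed on the simplex.
    w = -torch.log(torch.rand(m))
    return w / w.sum()

def train_panacea(policy, dataset, m: int, T: int, lr: float = 1e-5):
    optimizer = torch.optim.AdamW(policy.parameters(), lr=lr)
    for t in range(T):                                 # line 4
        batch = sample_batch(dataset)                  # line 5: sample a data batch B
        lam = sample_preference_vector(m)              # line 6: sample lambda ...
        policy.set_preference(lam)                     # ... and embed it into SVD-LoRA
        losses = per_dimension_losses(policy, batch)   # one objective per dimension
        loss = torch.dot(lam, torch.stack(losses))     # line 7: linear scalarization
        optimizer.zero_grad()
        loss.backward()                                # line 8: gradient step on theta
        optimizer.step()
    return policy
```

Linear scalarization is just one choice for the "aggregated objective" in line 7; any scalarization over the per-dimension losses slots in at that line.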
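
Since the dataset row above points at a Hugging Face Hub dataset, a minimal loading sketch with the datasets library follows; the "train" split name is an assumption to check against the dataset card.

```python
# Minimal sketch: pull the PKU-SafeRLHF (BeaverTails) preference data from the
# Hugging Face Hub. The "train" split name is an assumption; see the dataset card.
from datasets import load_dataset

ds = load_dataset("PKU-Alignment/PKU-SafeRLHF", split="train")
print(len(ds), ds[0])  # dataset size and one preference-annotated example
```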