Panacea: Pareto Alignment via Preference Adaptation for LLMs
Authors: Yifan Zhong, Chengdong Ma, Xiaoyuan Zhang, Ziran Yang, Haojun Chen, Qingfu Zhang, Siyuan Qi, Yaodong Yang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we assess the effectiveness and scalability of Panacea on several significant and challenging preference alignment problems with up to 10 dimensions, where the Pareto set cardinality grows exponentially with the number of dimensions, considerably surpassing the scope of current research. |
| Researcher Affiliation | Academia | 1Institute for Artificial Intelligence, Peking University. 2State Key Laboratory of General Artificial Intelligence. 3Department of Computer Science, City University of Hong Kong. 4Yuanpei College, Peking University. |
| Pseudocode | Yes | Appendix D, Algorithm 1 (Panacea): 1: Input: Rank k, preference dim m, dataset D, iterations T, initial model πinit (and, optionally, a reward model ri for each preference dimension i). 2: Output: Trained policy πθ. 3: Initialize πθ by initializing SVD-LoRA upon πinit based on k and m. 4: for t in 1 . . . T do 5: Sample from D a data batch B. 6: Sample a preference vector λ and embed it into πθ,λ. 7: Compute the aggregated objective for πθ,λ on B according to λ. 8: Update θ with gradient descent. 9: end for 10: Return πθ. (An illustrative sketch of this loop follows the table.) |
| Open Source Code | No | While our implementation is based on the open-source Safe-RLHF codebase, the paper does not contain an explicit statement that the specific code for Panacea is being released, nor does it provide a direct link to a repository for Panacea's code. |
| Open Datasets | Yes | BeaverTails [28] License: Creative Commons Attribution-NonCommercial 4.0 URL: https://huggingface.co/datasets/PKU-Alignment/PKU-SafeRLHF |
| Dataset Splits | No | The paper describes the training dataset size and batch sizes, but it does not explicitly specify the proportions or sizes of training, validation, and test splits for the datasets used in the experiments. |
| Hardware Specification | Yes | All our experiments are conducted on an 8 A800-80GB GPU server. |
| Software Dependencies | No | The paper states that its implementation is based on the Safe-RLHF codebase and lists other models used, but it does not provide specific version numbers for key software components or libraries (e.g., Python, PyTorch, CUDA, or specific versions of the Safe-RLHF codebase). |
| Experiment Setup | Yes | In this part, we present details about the experiment setup. In Table 2, Table 4, and Table 5 we provide the common hyperparameters for Panacea with RLHF, DPO, and SFT. |
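
The pseudocode quoted in the table describes a single training loop: sample a preference vector per batch, embed it into an SVD-LoRA adapter, and optimize a preference-weighted aggregation of the per-dimension objectives. Below is a minimal PyTorch sketch of that loop under stated assumptions. The module and function names (`PanaceaSVDLoRA`, `aggregated_loss`) and the toy objectives are illustrative stand-ins rather than the authors' implementation, and the aggregation shown is simple linear scalarization.

```python
# Minimal sketch of Algorithm 1 (Panacea) in PyTorch.
# All names below are illustrative, not the authors' code.
import torch
import torch.nn as nn

class PanaceaSVDLoRA(nn.Module):
    """SVD-LoRA-style adapter whose last pref_dim diagonal entries are
    replaced by the (scaled) preference vector lambda (illustrative)."""
    def __init__(self, base: nn.Linear, rank: int, pref_dim: int, scale: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # freeze the base weights
            p.requires_grad_(False)
        d_out, d_in = base.out_features, base.in_features
        self.U = nn.Parameter(torch.zeros(d_out, rank + pref_dim))
        self.V = nn.Parameter(torch.randn(rank + pref_dim, d_in) * 0.01)
        self.sigma = nn.Parameter(torch.ones(rank))          # learnable singular values
        self.scale = scale                                    # scaling for the embedded lambda
        self.register_buffer("lam", torch.zeros(pref_dim))    # current preference vector

    def set_preference(self, lam: torch.Tensor):
        self.lam = lam.to(self.lam.device)

    def forward(self, x):
        diag = torch.cat([self.sigma, self.scale * self.lam])   # (rank + pref_dim,)
        delta = self.U @ torch.diag(diag) @ self.V               # low-rank update
        return self.base(x) + x @ delta.T


def aggregated_loss(per_objective_losses: torch.Tensor, lam: torch.Tensor) -> torch.Tensor:
    """Linear-scalarization aggregation of the per-dimension objectives."""
    return torch.dot(lam, per_objective_losses)


# Toy training loop following the quoted Algorithm 1.
torch.manual_seed(0)
pref_dim, rank = 2, 4
policy = PanaceaSVDLoRA(nn.Linear(16, 16), rank=rank, pref_dim=pref_dim)
opt = torch.optim.AdamW([p for p in policy.parameters() if p.requires_grad], lr=1e-3)

for step in range(100):                                        # T iterations
    batch = torch.randn(8, 16)                                 # stand-in data batch B
    lam = torch.distributions.Dirichlet(torch.ones(pref_dim)).sample()  # lambda on the simplex
    policy.set_preference(lam)                                 # embed lambda into the adapter
    out = policy(batch)
    # Stand-in per-dimension objectives (e.g., helpfulness / harmlessness losses).
    losses = torch.stack([out.pow(2).mean(), (out - 1).pow(2).mean()])
    loss = aggregated_loss(losses, lam)                        # aggregate according to lambda
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The design choice mirrored here is that, per the paper's method, the preference vector enters the model only through extra diagonal entries of the SVD-LoRA decomposition, so one set of trained weights can be steered across the Pareto front at inference time simply by swapping in a different λ.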