Aligner: Efficient Alignment by Learning to Correct

Authors: Jiaming Ji, Boyuan Chen, Hantao Lou, Donghai Hong, Borong Zhang, Xuehai Pan, Tianyi (Alex) Qiu, Juntao Dai, Yaodong Yang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate performance improvements by deploying the same Aligner model across 11 different LLMs, evaluated on the 3H dimensions (helpfulness, harmlessness, and honesty). Specifically, Aligner-7B has achieved an average improvement of 68.9% in helpfulness and 22.8% in harmlessness across the tested LLMs while also effectively reducing hallucination.
Researcher Affiliation | Academia | (1) Institute for AI, Peking University; (2) State Key Laboratory of General Artificial Intelligence, Institute for AI, Peking University
Pseudocode | Yes | Algorithm 1 Aligner Pseudocode
Open Source Code | Yes | Open Source. We also release all the training code, Aligner models, and the 100K Q-A-C dataset, AlignerTails, to empower the community to explore and advance correction paradigms.
Open Datasets | Yes | We utilize two open-source preference datasets, HH-RLHF [5] and PKU-SafeRLHF [19, 20], as our preference datasets. Considering that the preference pairs in PKU-SafeRLHF are generated solely by Alpaca-7B, we additionally construct a 50K preference dataset based on these two preference datasets using the correction paradigm.
Dataset Splits | No | The paper mentions training and evaluation datasets and various models, but does not explicitly provide the train/validation/test splits, whether as percentages or absolute counts, needed for reproducibility.
Hardware Specification | Yes | We conducted all training on 8 NVIDIA A800 GPUs.
Software Dependencies | No | The paper mentions using Python and DeepSpeed ZeRO-3 but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | For detailed training parameters, please see Appendix D. We trained the Aligner model at three scales: 2B, 7B, and 13B, using data volumes of 20K, 30K, 40K, and 50K. Throughout training, we used the AdamW optimizer, setting β1 to 0.9 and β2 to 0.95.
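The correction paradigm referenced in the Pseudocode and Open Datasets rows can be sketched as a two-stage inference pipeline: the frozen upstream LLM drafts an answer, and the Aligner, conditioned on both the query and the draft, emits a corrected answer. The sketch below is a minimal illustration under that assumption; the function names and toy stand-in models are hypothetical, not from the paper.

```python
def aligned_response(query, upstream_model, aligner):
    """Two-stage inference: the upstream LLM answers, the Aligner corrects.

    Both arguments are assumed to be callables wrapping real LLMs; the
    upstream model's weights are never modified (plug-and-play behavior,
    which is why one Aligner can sit atop 11 different upstream LLMs).
    """
    draft = upstream_model(query)   # stage 1: initial (possibly unsafe) answer
    return aligner(query, draft)    # stage 2: corrected answer


# Toy stand-ins that show only the data flow, not real models.
upstream = lambda q: f"draft answer to: {q}"
corrector = lambda q, a: f"corrected({a})"
```

A usage call such as `aligned_response("some query", upstream, corrector)` makes the flow visible: the Aligner sees both the original query and the upstream draft, so it can correct rather than regenerate from scratch.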
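Since the setup row reports only the AdamW betas (β1 = 0.9, β2 = 0.95), the following pure-Python, single-parameter sketch shows what those coefficients control: β1 smooths the gradient estimate, β2 smooths the squared-gradient estimate. The learning rate, epsilon, and weight decay used here are illustrative defaults, not values from the paper (which defers full hyperparameters to Appendix D).

```python
import math

def adamw_step(p, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.95,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a single scalar parameter.

    beta1/beta2 follow the paper's reported settings (0.9 / 0.95);
    lr, eps, and weight_decay are assumptions for illustration only.
    """
    # Decoupled weight decay (the "W" in AdamW): applied directly to p,
    # not folded into the gradient as in classic L2-regularized Adam.
    p = p - lr * weight_decay * p
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialized moment estimates at step t.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
    return p, m, v
```

With β2 = 0.95 (rather than the common 0.999), the second-moment estimate adapts faster to recent gradient magnitudes, a choice frequently used in large-scale LLM training.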