RAIN: Your Language Models Can Align Themselves without Finetuning

Authors: Yuhui Li, Fangyun Wei, Jinjing Zhao, Chao Zhang, Hongyang Zhang

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results evaluated by GPT-4 and humans demonstrate the effectiveness of RAIN.
Researcher Affiliation | Collaboration | Yuhui Li (Peking University), Fangyun Wei (Microsoft Research), Jinjing Zhao (The University of Sydney), Chao Zhang (Peking University), Hongyang Zhang (University of Waterloo)
Pseudocode | Yes | The pseudo-code of RAIN is shown in Algorithm 1. (A simplified sketch of the loop appears below the table.)
Open Source Code | Yes | The code is available at https://github.com/SafeAILab/RAIN.
Open Datasets | Yes | For the harm-free generation task, we employ Anthropic's Helpful and Harmless (HH) dataset (Bai et al., 2022a). For the truthful generation task, we employ the TruthfulQA dataset (Lin et al., 2022), aiming to generate factually grounded, truthful responses. For the controlled sentiment generation task, we employ the IMDB dataset (Maas et al., 2011). (A loading sketch appears below the table.)
Dataset Splits | No | The paper mentions several datasets but does not provide specific training, validation, or test splits.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models) used to run the experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers.
Experiment Setup | Yes | In all the experiments of this paper, the hyper-parameter c is set to 2, and γ is set to 0.2. For all tasks except truthfulness, we set Tm = 2 and V = 0.7, whereas for truthfulness, due to its increased complexity, we used Tm = 16 and V = 0.8. Across all tasks, the upper limit of the inner loop iterations, represented by T, was fixed at 50. (These values are collected in the configuration sketch below the table.)
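
The Pseudocode row above points to the paper's Algorithm 1, which specifies RAIN's full rewindable, tree-search-based inference. The Python sketch below is a deliberately simplified illustration of the core idea only: propose a continuation, self-evaluate it with the same model, and rewind and retry when the score is too low. Here `generate_fn` and `evaluate_fn` are hypothetical stand-ins, the search-tree bookkeeping and value updates of Algorithm 1 are omitted, and the threshold and cap merely mirror the V and T hyper-parameters quoted in the Experiment Setup row.

```python
import random


def rain_step(prefix, generate_fn, evaluate_fn, value_threshold=0.7, max_iters=50):
    """Simplified RAIN-style inner loop (illustration only, not Algorithm 1).

    generate_fn(prefix)            -> a candidate continuation (str)
    evaluate_fn(prefix, candidate) -> a self-evaluation score in [0, 1]
    value_threshold / max_iters    mirror the paper's V and T.
    """
    best_candidate, best_score = None, float("-inf")
    for _ in range(max_iters):
        candidate = generate_fn(prefix)          # forward: propose a continuation
        score = evaluate_fn(prefix, candidate)   # self-evaluate with the same model
        if score > best_score:
            best_candidate, best_score = candidate, score
        if score >= value_threshold:             # good enough: commit, no rewind
            return candidate
        # otherwise "rewind": discard the candidate and sample again
    return best_candidate                        # cap reached: keep the best attempt


# Toy usage with stand-in functions (no real LLM involved):
candidates = ["a harmless reply", "a risky reply", "a safe reply"]
demo_generate = lambda prefix: random.choice(candidates)
demo_evaluate = lambda prefix, c: 0.9 if ("harmless" in c or "safe" in c) else 0.1
print(rain_step("User: hi\nAssistant:", demo_generate, demo_evaluate))
```

Passing the two functions in keeps the sketch model-agnostic: any LLM wrapper that can score its own drafts fits this interface, which is the property RAIN exploits to avoid finetuning.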
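
All three datasets in the Open Datasets row are publicly available. The paper cites the original dataset papers rather than a distribution channel, so the Hugging Face Hub identifiers below are an assumption about where equivalent copies can be fetched, not the authors' own loading code.

```python
from datasets import load_dataset  # pip install datasets

# Assumed Hugging Face Hub identifiers for the three cited datasets:
hh = load_dataset("Anthropic/hh-rlhf")                  # Bai et al., 2022a
truthfulqa = load_dataset("truthful_qa", "generation")  # Lin et al., 2022
imdb = load_dataset("imdb")                             # Maas et al., 2011

print(hh["train"][0]["chosen"][:80])            # HH pairs "chosen"/"rejected" dialogues
print(truthfulqa["validation"][0]["question"])  # TruthfulQA ships a single validation split
print(imdb["train"][0]["label"])                # 0 = negative, 1 = positive review
```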
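
For convenience, the hyper-parameter values quoted in the Experiment Setup row can be collected in a small configuration object. The class and field names below are hypothetical; the semantics of c, γ, Tm, V, and T are defined in the paper's Algorithm 1, and only the numeric values are taken from the quote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RAINConfig:
    # Field semantics are defined in the paper's Algorithm 1; only the
    # numeric values below come from the quoted experiment setup.
    c: float = 2.0      # fixed at 2 in all experiments
    gamma: float = 0.2  # γ, fixed at 0.2 in all experiments
    Tm: int = 2         # 2 for all tasks except truthfulness
    V: float = 0.7      # 0.7 for all tasks except truthfulness
    T: int = 50         # inner-loop iteration cap, fixed across all tasks


DEFAULT = RAINConfig()                   # all tasks except truthfulness
TRUTHFULNESS = RAINConfig(Tm=16, V=0.8)  # TruthfulQA, "due to its increased complexity"
```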