Transfer Q*: Principled Decoding for LLM Alignment

Authors: Souradip Chakraborty, Soumya Suvra Ghosal, Ming Yin, Dinesh Manocha, Mengdi Wang, Amrit Singh Bedi, Furong Huang

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach significantly reduces the sub-optimality gap observed in prior SoTA methods and demonstrates superior empirical performance across key metrics such as coherence, diversity, and quality in extensive tests on several synthetic and real datasets. |
| Researcher Affiliation | Academia | Souradip Chakraborty (1), Soumya Suvra Ghosal (1), Ming Yin (2), Dinesh Manocha (1), Mengdi Wang (2), Amrit Singh Bedi (3), Furong Huang (1); (1) University of Maryland, College Park; (2) Princeton University; (3) University of Central Florida |
| Pseudocode | Yes | Algorithm 1: Transfer Q*: LLM Alignment via Transfer Decoding |
| Open Source Code | Yes | The code is available at https://github.com/umd-huang-lab/Transfer-Q. |
| Open Datasets | Yes | Our experimentation is primarily based on the UltraFeedback [12], Berkeley Nectar [53], and HH-RLHF (Helpful and Harmless) [5] datasets. |
| Dataset Splits | No | The paper mentions a 'test dataset' and 'test set' in Section 4 (e.g., "For evaluation, we compare the performance of the response generated by the language model corresponding to each prompt in the test dataset" and "randomly sample 300 prompts from the test set"), but it never specifies explicit train/validation/test splits, such as an 80/10/10 split or per-split sample counts. |
| Hardware Specification | Yes | For all experimentation, we use two Nvidia RTX A6000 GPUs. |
| Software Dependencies | Yes | We run all experiments with Python 3.7.4 and PyTorch 1.9.0. |
| Experiment Setup | Yes | For implementation, we set the number of tokens sampled k = 10 and the decoding alignment parameter α = 1. Ablations are reported in Appendix J.3. |
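The two reported hyperparameters parameterize a decoding step of the general form used by alignment-guided decoders: restrict sampling to the k highest-probability tokens under the base model, then re-score each candidate with an alignment value weighted by α. The sketch below is illustrative only; `transfer_decode_step` and the toy Q-values are assumptions for demonstration, not the authors' implementation.

```python
import numpy as np

def transfer_decode_step(logits, q_values, k=10, alpha=1.0, rng=None):
    """One alignment-guided decoding step (illustrative sketch).

    Keeps the top-k tokens by base-model logit, re-scores each as
    logit + alpha * q_value, and samples from the resulting softmax.
    With alpha = 0 this reduces to plain top-k sampling.
    """
    rng = np.random.default_rng() if rng is None else rng
    top_k = np.argsort(logits)[-k:]                 # k most likely token ids
    scores = logits[top_k] + alpha * q_values[top_k]
    probs = np.exp(scores - scores.max())           # stable softmax
    probs /= probs.sum()
    return int(top_k[rng.choice(k, p=probs)])

# Toy vocabulary of 50 tokens with random logits and (hypothetical) Q estimates.
rng = np.random.default_rng(0)
logits = rng.normal(size=50)
q_vals = rng.normal(size=50)
token = transfer_decode_step(logits, q_vals, k=10, alpha=1.0, rng=rng)
```

The paper's setting k = 10, α = 1 corresponds to the defaults above: a small candidate pool with the alignment score weighted equally with the base-model logit.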