HonestLLM: Toward an Honest and Helpful Large Language Model

Authors: Chujie Gao, Siyuan Wu, Yue Huang, Dongping Chen, Qihui Zhang, Zhengyan Fu, Yao Wan, Lichao Sun, Xiangliang Zhang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments conducted on nine prominent LLMs demonstrate a significant improvement in alignment with honesty across all models through the implementation of our proposed enhancements. Particularly noteworthy is the 65.3% enhancement observed in Llama3-8b and the remarkable 124.7% improvement in Mistral-7b, as measured by the H2 (honest and helpful) assessment.
Researcher Affiliation | Academia | Chujie Gao (1), Siyuan Wu (2), Yue Huang (3), Dongping Chen (2,4), Qihui Zhang (5), Zhengyan Fu (2), Yao Wan (2), Lichao Sun (6), Xiangliang Zhang (3). Affiliations: 1 MBZUAI, 2 Huazhong University of Science and Technology, 3 University of Notre Dame, 4 University of Washington, 5 Peking University, 6 Lehigh University
Pseudocode | Yes | Algorithm 1: Two-Stage Fine-Tuning of LLMs for Honesty Enhancement (a hedged configuration sketch of this two-stage setup follows the table)
Open Source Code | Yes | Code is available at https://github.com/Flossiee/HonestyLLM.
Open Datasets | Yes | We introduce HONESET (Honesty Dataset), the first dataset containing queries that LLMs are unable to solve. ... Overall, we collected a total of 930 queries, carefully curated to ensure a comprehensive dataset representing various categories where LLMs struggle. ... We have provided all the code and data related to this paper, and packaged these resources into a compressed file as supplementary material.
Dataset Splits | Yes | The number of epochs was determined by monitoring the eval loss, ensuring it decreased steadily without overfitting. We selected the checkpoint with the minimum eval loss to ensure optimal model performance.
Hardware Specification | Yes | The training process was conducted on a server equipped with two NVIDIA RTX 4090 GPUs, each with 24GB of VRAM.
Software Dependencies | No | We used LoRA [63] to fine-tune Llama3-8b and Mistral-7b. ... the optimizer was Adam [64] ... We utilized the LLAMA-Factory framework for the training process [65].
Experiment Setup | Yes | For each model, we adopted consistent hyperparameter settings. Specifically, we set the model temperature to 0 to ensure reproducibility and set top-p to 1. For Llama3-70b, Mixtral-8x7b, and Llama2-70b, we use the inference API from Replicate. ... We used LoRA [63] to fine-tune Llama3-8b and Mistral-7b. The rank of LoRA was set to 8, the learning rate was e-5, the optimizer was Adam [64], the model was trained for 5 epochs, the batch size was 1, and mixed precision training was used. The training process was conducted on a server equipped with two NVIDIA RTX 4090 GPUs, each with 24GB of VRAM. We utilized the LLAMA-Factory framework for the training process [65].
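
The quoted fine-tuning settings map onto a standard parameter-efficient recipe. Below is a minimal sketch, in Python with Hugging Face transformers and peft, of a sequential two-stage LoRA run using the reported hyperparameters (rank 8, an Adam-family optimizer, 5 epochs, batch size 1, mixed precision, checkpoint selection by minimum eval loss). It is not the authors' Algorithm 1 or their LLaMA-Factory configuration: the dataset file names, the 1e-5 learning rate (the quote only gives "e-5"), the base-model identifier, and the finetune_stage helper are illustrative assumptions, and the construction of each stage's training data is defined in the paper and repository, not here.

    # Hedged sketch: two sequential LoRA fine-tuning stages with the hyperparameters
    # quoted in the table above. Dataset files and the split into "stage 1" / "stage 2"
    # are hypothetical placeholders.
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    BASE_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"  # or "mistralai/Mistral-7B-Instruct-v0.2"


    def finetune_stage(model, tokenizer, train_file, eval_file, output_dir):
        """One supervised fine-tuning stage; keeps the checkpoint with minimum eval loss."""
        data = load_dataset("json", data_files={"train": train_file, "eval": eval_file})
        data = data.map(
            lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
            remove_columns=data["train"].column_names,
        )
        args = TrainingArguments(
            output_dir=output_dir,
            num_train_epochs=5,               # "trained for 5 epochs"
            per_device_train_batch_size=1,    # "the batch size was 1"
            learning_rate=1e-5,               # assumed value; the quote only reports "e-5"
            optim="adamw_torch",              # the paper reports Adam [64]; AdamW is the Trainer default
            bf16=True,                        # "mixed precision training was used"
            eval_strategy="epoch",            # named "evaluation_strategy" in older transformers releases
            save_strategy="epoch",
            load_best_model_at_end=True,      # "checkpoint with the minimum eval loss"
            metric_for_best_model="eval_loss",
            greater_is_better=False,
        )
        trainer = Trainer(
            model=model,
            args=args,
            train_dataset=data["train"],
            eval_dataset=data["eval"],
            data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
        )
        trainer.train()
        return trainer.model


    if __name__ == "__main__":
        tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
        tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
        model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")

        # "The rank of LoRA was set to 8."
        model = get_peft_model(model, LoraConfig(r=8, task_type="CAUSAL_LM"))

        # Two sequential stages, per "Two-Stage Fine-Tuning of LLMs for Honesty Enhancement";
        # each JSON file is assumed to carry one formatted "text" field per example.
        model = finetune_stage(model, tokenizer, "stage1_train.json", "stage1_eval.json", "out/stage1")
        model = finetune_stage(model, tokenizer, "stage2_train.json", "stage2_eval.json", "out/stage2")
        model.save_pretrained("out/honesty_lora_adapter")

In the paper the same hyperparameters are driven through LLaMA-Factory [65]; the sketch only makes the quoted values concrete in code.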
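
For the larger models served through Replicate (Llama3-70b, Mixtral-8x7b, Llama2-70b), the quoted decoding settings would look roughly as follows. This is a sketch under assumptions: the model slug and input field names follow Replicate's public model schemas rather than anything stated in the paper, and some hosted models clamp temperature to a small positive minimum rather than accepting exactly 0.

    # Hedged sketch of the quoted inference settings (temperature 0, top-p 1) via Replicate.
    # Requires REPLICATE_API_TOKEN in the environment.
    import replicate

    output = replicate.run(
        "meta/meta-llama-3-70b-instruct",  # assumed model slug on Replicate
        input={
            "prompt": "What exactly will the weather be in London one year from today?",
            "temperature": 0,              # "we set the model temperature to 0"
            "top_p": 1,                    # "set top-p to 1"
        },
    )
    print("".join(output))                 # replicate.run streams text chunks for language models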