DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Authors: Yung-Sung Chuang, Yujia Xie, Hongyin Luo, Yoon Kim, James R. Glass, Pengcheng He

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on TruthfulQA (Lin et al., 2022) and FACTOR (Muhlgay et al., 2023) demonstrate that DoLa is able to increase the truthfulness of models of the LLaMA family (Touvron et al., 2023).
Researcher Affiliation | Collaboration | Massachusetts Institute of Technology; Microsoft (yungsung@mit.edu, yujiaxie@microsoft.com, {hyluo,yoonkim,glass}@mit.edu, herbert.he@gmail.com)
Pseudocode | No | The paper describes its method in text and mathematical equations but does not include a clearly labeled pseudocode or algorithm block (a hedged sketch of the decoding step follows the table).
Open Source Code | Yes | The source code is available at https://github.com/voidism/DoLa.
Open Datasets | Yes | For multiple choice, we use TruthfulQA (Lin et al., 2022) and FACTOR (News/Wiki) (Muhlgay et al., 2023) to assess LM factuality in the short-answer and long-paragraph settings, respectively.
Dataset Splits | Yes | We use either two-fold validation (TruthfulQA-MC, FACTOR) or a validation set (GSM8K, StrategyQA) to select the best bucket (a sketch of this selection also follows the table).
Hardware Specification | Yes | We run all experiments on NVIDIA V100 GPUs, on machines equipped with 40-core Intel(R) Xeon(R) Platinum 8168 CPUs @ 2.70GHz.
Software Dependencies | No | The paper mentions using the Hugging Face Transformers and Accelerate packages but does not specify their version numbers.
Experiment Setup | Yes | We set the adaptive plausibility constraint (α) to 0.1 and the repetition penalty (θ) to 1.2, following prior studies (Li et al., 2022; Keskar et al., 2019).
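
Since the paper provides no algorithm block (see the Pseudocode row), the following is a minimal PyTorch sketch of one DoLa decoding step as described in the paper: early-exit distributions from a bucket of candidate premature layers, dynamic selection of the premature layer by maximum Jensen-Shannon divergence, the adaptive plausibility constraint (α = 0.1), and a CTRL-style repetition penalty (θ = 1.2). All names here (`dola_next_token_logprobs`, `lm_head`, `hidden_states`) are illustrative assumptions, not taken from the authors' repository at https://github.com/voidism/DoLa.

```python
# Hedged sketch of one DoLa decoding step; not the authors' implementation.
import torch
import torch.nn.functional as F

def js_divergence(p, q, eps=1e-10):
    """Jensen-Shannon divergence between two probability vectors."""
    m = 0.5 * (p + q)
    kl = lambda a, b: (a * (a.add(eps).log() - b.add(eps).log())).sum(-1)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def dola_next_token_logprobs(hidden_states, lm_head, candidate_layers,
                             prev_tokens, alpha=0.1, theta=1.2):
    """One decoding step of DoLa (sketch).

    hidden_states:    per-layer hidden states at the last position, each of
                      shape (hidden_dim,); index -1 is the final layer.
    lm_head:          the output projection mapping a hidden state to vocab logits.
    candidate_layers: the "bucket" of candidate premature layers.
    prev_tokens:      token ids generated so far (for the repetition penalty).
    """
    # Mature distribution from the final layer.
    q_mature = F.softmax(lm_head(hidden_states[-1]), dim=-1)

    # Dynamically pick the premature layer whose early-exit distribution
    # differs most (max Jensen-Shannon divergence) from the mature one.
    early_dists = [F.softmax(lm_head(hidden_states[j]), dim=-1)
                   for j in candidate_layers]
    jsd = torch.stack([js_divergence(q_mature, q) for q in early_dists])
    q_premature = early_dists[jsd.argmax().item()]

    # Contrast the two distributions, restricted by the adaptive
    # plausibility constraint: keep only tokens whose mature probability
    # is at least alpha times the max probability (alpha = 0.1).
    scores = q_mature.log() - q_premature.log()
    plausible = q_mature >= alpha * q_mature.max()
    scores = scores.masked_fill(~plausible, float("-inf"))

    # Simple repetition penalty (theta = 1.2) on already generated tokens,
    # in the spirit of Keskar et al. (2019).
    for tok in set(prev_tokens):
        scores[tok] = scores[tok] / theta if scores[tok] > 0 else scores[tok] * theta

    return scores
```

At generation time, the next token would then be chosen greedily or sampled from a softmax over these contrastive scores.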
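
Likewise, the Dataset Splits row mentions selecting the best candidate-layer bucket via two-fold validation on TruthfulQA-MC and FACTOR. A minimal sketch of such a selection loop, assuming a hypothetical `evaluate(bucket, fold)` callable that returns the task metric (the exact protocol is in the paper and released code):

```python
# Hedged sketch of two-fold validation for bucket selection; `evaluate`
# and the bucket definitions are hypothetical placeholders.
def select_bucket_two_fold(dataset, buckets, evaluate):
    """For each fold, pick the bucket that scores best on the *other* fold,
    then report its score on the held-out fold."""
    half = len(dataset) // 2
    folds = [dataset[:half], dataset[half:]]
    results = []
    for held_out in (0, 1):
        tune_fold = folds[1 - held_out]  # fold used for selection
        best = max(buckets, key=lambda b: evaluate(b, tune_fold))
        results.append((best, evaluate(best, folds[held_out])))
    return results  # (chosen bucket, held-out score) per fold
```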