In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation

Authors: Shiqi Chen, Miao Xiong, Junteng Liu, Zhengxuan Wu, Teng Xiao, Siyang Gao, Junxian He

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on various knowledge-seeking and hallucination benchmarks demonstrate our approach's consistent effectiveness, for example, achieving up to an 8.6-point improvement on TruthfulQA. We believe this study can improve our understanding of hallucinations and serve as a practical solution for hallucination mitigation. Code is publicly available at https://github.com/hkust-nlp/Activation_Decoding.
Researcher Affiliation | Academia | 1 City University of Hong Kong; 2 National University of Singapore; 3 Shanghai Jiao Tong University; 4 Stanford University; 5 Penn State University; 6 HKUST.
Pseudocode | Yes | The decoding method is illustrated in Figure 4, and the pseudo-algorithm is shown in Algorithm 1 of the Appendix.
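For readers without the paper at hand, the sketch below illustrates one way such an entropy-adjusted decoding step could look. It assumes an adjustment of the form p'(v) ∝ p(v) · exp(−λ · H_v), where H_v is the entropy of candidate token v's in-context activation distribution over context positions at an intermediate layer l; the function name `entropy_adjusted_logits`, the tensor layout, and this exact formula are illustrative assumptions, not the authors' Algorithm 1.

```python
import torch
import torch.nn.functional as F

def entropy_adjusted_logits(logits, hidden_states_l, unembed, lam=0.5):
    """Hedged sketch of an entropy-adjusted decoding step.

    logits:          (vocab,) next-token logits from the final layer
    hidden_states_l: (seq_len, hidden) hidden states at informative layer l
    unembed:         (hidden, vocab) unembedding / LM-head weight
    lam:             weight lambda on the entropy correction
    """
    # Per-position scores for every candidate token at layer l.
    pos_logits = hidden_states_l @ unembed                 # (seq_len, vocab)
    # For each candidate token, normalize over context positions to get an
    # "in-context activation" distribution, then take its entropy.
    pos_probs = F.softmax(pos_logits, dim=0)               # (seq_len, vocab)
    entropy = -(pos_probs * torch.log(pos_probs + 1e-10)).sum(dim=0)  # (vocab,)
    # Sharper (low-entropy) candidates are up-weighted:
    # equivalent to p'(v) ∝ p(v) * exp(-lam * H_v).
    return logits - lam * entropy
```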
Open Source Code | Yes | Code is publicly available at https://github.com/hkust-nlp/Activation_Decoding.
Open Datasets | Yes | We evaluate our method on two categories of datasets: truthfulness-related and knowledge-seeking datasets, and consider two types of question-answering settings: multiple-choice and open-ended text generation. We follow Chuang et al. (2024) in using TruthfulQA (Lin et al., 2022) as the truthfulness-related benchmark, and conduct both multiple-choice and open-ended text generation tasks on it. For the knowledge-seeking datasets, we consider the commonly used question-answering benchmarks TriviaQA (Joshi et al., 2017), HotpotQA (Yang et al., 2018), and Natural Questions (NQ) (Kwiatkowski et al., 2019).
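All four benchmarks are publicly hosted on the Hugging Face hub. The loading sketch below is a convenience for reproducers; the dataset IDs and config names are assumptions based on the public hub versions rather than paths given in the paper or its repository.

```python
from datasets import load_dataset

# Truthfulness benchmark, in both evaluation settings used by the paper.
truthfulqa_mc = load_dataset("truthful_qa", "multiple_choice")["validation"]
truthfulqa_gen = load_dataset("truthful_qa", "generation")["validation"]

# Knowledge-seeking QA benchmarks.
triviaqa = load_dataset("trivia_qa", "rc.nocontext")["validation"]
hotpotqa = load_dataset("hotpot_qa", "distractor")["validation"]
nq = load_dataset("natural_questions", "default")["validation"]
```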
Dataset Splits | Yes | The hyperparameters used for these models are tuned by 2-fold validation on the respective benchmark separately. ... During our experiments, we tested two approaches: 1) in-domain validation, where we use two-fold validation for the respective benchmark separately (see Table 3), and 2) out-of-domain validation, where we use the Truth*Info metric on TruthfulQA as the validation metric and fix these hyperparameters for all other benchmarks.
Hardware Specification | Yes | Figure 6: Comparison of inference time on 722 samples from Natural Questions (we randomly sample 20% of the validation set) using the LLaMA-2-chat-7B model on a single NVIDIA Tesla A800 80GB GPU.
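A minimal sketch of how such a per-sample timing could be reproduced is given below; the `time_decoding` helper and the subsampling step are assumptions about a plausible measurement protocol, not the authors' benchmarking script.

```python
import time
import torch

def time_decoding(model, tokenizer, questions, max_new_tokens=64):
    """Average wall-clock seconds per question for generation on GPU."""
    torch.cuda.synchronize()
    start = time.perf_counter()
    for q in questions:
        inputs = tokenizer(q, return_tensors="pt").to(model.device)
        model.generate(**inputs, max_new_tokens=max_new_tokens)
    torch.cuda.synchronize()  # wait for all CUDA work before stopping the clock
    return (time.perf_counter() - start) / len(questions)

# Example: a 20% subsample of the NQ validation set (722 questions in Figure 6)
# seconds_per_sample = time_decoding(model, tokenizer, nq_subset_questions)
```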
Software Dependencies | No | The paper mentions models like LLaMA-2 and Mistral but does not provide specific version numbers for general software dependencies such as programming languages, libraries (e.g., PyTorch, TensorFlow), or operating systems.
Experiment Setup | Yes | Our method involves two hyperparameters: the informative layer l for activation calculations, and the factor λ that controls entropy's influence on the next-token probability distribution. ... We select from a range of intermediate layers based on the model's depth (e.g., [24, 26, 28, 30] for LLaMA-2-chat-7B with 32 layers) and set a range for λ (e.g., [0.4, 0.5, 0.6]).
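Read together with the Dataset Splits row, the tuning procedure amounts to a two-fold grid search over (l, λ). The sketch below assumes that shape; `evaluate` is a hypothetical stand-in for the benchmark metric (e.g., Truth*Info on TruthfulQA) and is not from the paper's code.

```python
from itertools import product

def tune_hyperparams(samples, evaluate,
                     layers=(24, 26, 28, 30),   # intermediate layers, 32-layer model
                     lambdas=(0.4, 0.5, 0.6)):  # candidate entropy weights
    """Two-fold validation: pick (l, lambda) on one half, score on the other."""
    mid = len(samples) // 2
    folds = [(samples[:mid], samples[mid:]), (samples[mid:], samples[:mid])]
    scores = []
    for fit_half, held_out in folds:
        # Grid search on the fitting half...
        best_l, best_lam = max(
            product(layers, lambdas),
            key=lambda hp: evaluate(fit_half, layer=hp[0], lam=hp[1]),
        )
        # ...then report performance on the held-out half.
        scores.append(evaluate(held_out, layer=best_l, lam=best_lam))
    return sum(scores) / len(scores)
```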