In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation
Authors: Shiqi Chen, Miao Xiong, Junteng Liu, Zhengxuan Wu, Teng Xiao, Siyang Gao, Junxian He
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various knowledge-seeking and hallucination benchmarks demonstrate our approach's consistent effectiveness, for example, achieving up to an 8.6 point improvement on TruthfulQA. We believe this study can improve our understanding of hallucinations and serve as a practical solution for hallucination mitigation. Code is publicly available at https://github.com/hkust-nlp/Activation_Decoding. |
| Researcher Affiliation | Academia | ¹City University of Hong Kong, ²National University of Singapore, ³Shanghai Jiao Tong University, ⁴Stanford University, ⁵Penn State University, ⁶HKUST. |
| Pseudocode | Yes | The decoding method is illustrated in Figure 4, and the pseudo algorithm is shown in Algorithm 1 of the Appendix. |
| Open Source Code | Yes | Code is publicly available at https://github.com/hkust-nlp/Activation_Decoding. |
| Open Datasets | Yes | We evaluate our method on two categories of datasets: truthfulness-related and knowledge-seeking datasets, and consider two types of question-answering settings: multiple-choice and open-ended text generation. We follow Chuang et al. (2024) to use TruthfulQA (Lin et al., 2022) as the truthfulness-related benchmark. We conduct both multiple-choice and open-ended text generation tasks on TruthfulQA. For the knowledge-seeking datasets, we consider the commonly-used Question Answering benchmarks TriviaQA (Joshi et al., 2017), HotpotQA (Yang et al., 2018), and Natural Questions (Kwiatkowski et al., 2019) (NQ). |
| Dataset Splits | Yes | The hyperparameters used for these models are tuned by 2-fold validation on the respective benchmark separately. ... During our experiments, we tested two approaches: 1) in-domain validation, where we use two-fold validation for the respective benchmark separately (see Table 3), and 2) out-of-domain validation, where we use the Truth*Info metric on TruthfulQA as the validation metric and fix these hyperparameters for all other benchmarks. (See the two-fold selection sketch after the table.) |
| Hardware Specification | Yes | Figure 6: Comparison of inference time on 722 samples from Natural Questions (we randomly sample 20% of the validation set) using the LLaMA-2-chat-7B model on a single NVIDIA Tesla A800 80GB GPU. |
| Software Dependencies | No | The paper mentions models like LLaMA-2 and Mistral but does not provide specific version numbers for general software dependencies such as programming languages, libraries (e.g., PyTorch, TensorFlow), or operating systems. |
| Experiment Setup | Yes | Our method involves two hyperparameters: informative layer l for activation calculations, and factor λ to control entropy's influence on the next token probability distribution. ... We select from a range of intermediate layers based on the model's depth (e.g., [24, 26, 28, 30] for LLaMA-2-chat-7B with 32 layers) and set a range for λ (e.g., [0.4, 0.5, 0.6]). (See the decoding sketch after the table.) |
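
The "Experiment Setup" row describes λ as weighting an entropy term that reshapes the next-token distribution, with activations drawn from an intermediate informative layer l. Below is a minimal sketch of one plausible form of that adjustment; the function name, the subtraction of λ·entropy from the log-probabilities, and the tensor shapes are assumptions for illustration, and the paper's exact rule is the one given in Algorithm 1 of its appendix.

```python
import torch

def entropy_adjusted_next_token(logits: torch.Tensor,
                                candidate_entropy: torch.Tensor,
                                lam: float = 0.5) -> torch.Tensor:
    """Hypothetical sketch: reshape the next-token distribution so that
    candidates whose in-context activations are 'sharp' (low entropy)
    gain probability mass. `candidate_entropy` holds one entropy value
    per vocabulary candidate, computed from hidden states at the chosen
    informative layer l; subtracting lam * entropy from the log-probs
    is an assumed combination rule, not the paper's exact formula."""
    log_probs = torch.log_softmax(logits, dim=-1)   # original next-token log-probs
    adjusted = log_probs - lam * candidate_entropy  # penalize high-entropy candidates
    return torch.softmax(adjusted, dim=-1)          # renormalize to a distribution
```

With the search ranges quoted in the table, l would be picked from [24, 26, 28, 30] and lam from [0.4, 0.5, 0.6] for LLaMA-2-chat-7B.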
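
The "Dataset Splits" row quotes a two-fold validation scheme for tuning l and λ on each benchmark. The sketch below illustrates that selection loop under stated assumptions: `score_fn(examples, layer, lam)` is a hypothetical callable that runs the decoding method with the given hyperparameters and returns the benchmark metric, and the even split into halves is illustrative.

```python
from itertools import product

def two_fold_select(examples, layers, lambdas, score_fn):
    """Hypothetical sketch of 2-fold validation: pick (layer, lambda)
    on one half of the benchmark, score on the held-out half, swap the
    halves, and average the two held-out scores."""
    mid = len(examples) // 2
    folds = [(examples[:mid], examples[mid:]),
             (examples[mid:], examples[:mid])]
    held_out_scores = []
    for dev, test in folds:
        # Grid-search the quoted ranges, e.g. layers=[24, 26, 28, 30]
        # and lambdas=[0.4, 0.5, 0.6], on the development half only.
        best_layer, best_lam = max(product(layers, lambdas),
                                   key=lambda cfg: score_fn(dev, *cfg))
        held_out_scores.append(score_fn(test, best_layer, best_lam))
    return sum(held_out_scores) / len(held_out_scores)
```

The out-of-domain variant quoted above would instead fix (l, λ) once using the Truth*Info metric on TruthfulQA and reuse those values for all other benchmarks.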