Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability

Authors: Zhongxiang Sun, Xiaoxue Zang, Kai Zheng, Jun Xu, Xiao Zhang, Weijie Yu, Yang Song, Han Li

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that ReDeEP significantly improves RAG hallucination detection accuracy. Additionally, we introduce AARF, which mitigates hallucinations by modulating the contributions of Knowledge FFNs and Copying Heads. The source code and dataset are available at: https://github.com/Jeryi-Sun/ReDEeP-ICLR.
Researcher Affiliation | Collaboration | 1 Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; 2 Kuaishou Technology Co., Ltd., Beijing, China; 3 School of Information Technology and Management, University of International Business and Economics
Pseudocode | No | The paper describes the methods ReDeEP and AARF using descriptive text and mathematical equations, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | The source code and dataset are available at: https://github.com/Jeryi-Sun/ReDEeP-ICLR.
Open Datasets | Yes | Our experiments on RAGTruth and Dolly (AC) confirm that ReDeEP significantly outperforms existing detection methods... We evaluate ReDeEP and AARF on two public RAG hallucination datasets. RAGTruth is the first high-quality, manually annotated RAG hallucination dataset... Dolly (AC) is a dataset with Accurate Context obtained from (Hu et al., 2024)...
Dataset Splits | Yes | For RAGTruth, we use the validation set to select the hyperparameters. For Dolly (AC), we use two-fold validation to select the hyperparameters.
Hardware Specification | Yes | We run all the experiments on machines equipped with NVIDIA V100 GPUs and 52-core Intel(R) Xeon(R) Gold 6230R CPUs at 2.10GHz.
Software Dependencies | Yes | We utilize the Huggingface Transformers package to conduct experiments. During the decoding of responses from the language models, we employ greedy search to generate responses. The remaining parameters follow the model's default settings... For text chunking, we utilized LangChain, a popular open-source toolkit, and applied the RecursiveCharacterTextSplitter for the segmentation process.
Experiment Setup | Yes | For RAGTruth, we use the validation set to select the hyperparameters. For Dolly (AC), we use two-fold validation to select the hyperparameters... For ReDeEP(Chunk) on Dolly (AC), on Llama2-7B, we select the top-7 scoring Copying Heads and top-3 FFN layers with α = 1 and β = 1.6, as described in Section 3. On Llama2-13B, we select the top-11 scoring Copying Heads and top-3 FFN layers with α = 1 and β = 0.2... For AARF, the parameters are relatively simple. We used grid search for τ in the range of (0, 1) with a step size of 0.1. For the weight α2, grid search was performed within the values [1, 2, 5, 10], and for β2, grid search was performed in the range of (0, 1) with a step size of 0.1.
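The chunking step quoted under Software Dependencies can be illustrated with a pure-Python sketch of the recursive character splitting strategy. This is a simplification for illustration only, not LangChain's actual implementation: it omits chunk overlap and the merging of adjacent small pieces, and the separator hierarchy is the commonly documented default.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Sketch of recursive character splitting: try the coarsest
    separator first, then recurse with finer separators on any piece
    that still exceeds chunk_size."""
    if len(text) <= chunk_size:
        return [text] if text else []
    sep = separators[0]
    if sep == "":
        # Character-level fallback: hard-cut into chunk_size slices.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks = []
    for piece in text.split(sep):
        if len(piece) <= chunk_size:
            if piece:
                chunks.append(piece)
        else:
            chunks.extend(recursive_split(piece, chunk_size, separators[1:]))
    return chunks
```

For example, `recursive_split("aaaa bbbb cccc", 9)` falls through the paragraph and newline separators and splits on spaces, yielding `["aaaa", "bbbb", "cccc"]`.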
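The Experiment Setup row quotes ReDeEP(Chunk) combining the top-k scoring Copying Heads and top-k FFN layers with weights α and β. The sketch below shows one way such a weighted aggregation could look; the per-head and per-layer signals, the sign convention, and the averaging are all assumptions standing in for the paper's actual scoring functions (the paper's intuition is that hallucination reflects over-reliance on parametric Knowledge FFNs relative to context-copying heads).

```python
def redeep_style_score(head_scores, ffn_scores, top_heads, top_ffn,
                       alpha=1.0, beta=1.6):
    """Hypothetical aggregation: average the top-k Copying Head signals
    and the top-k Knowledge FFN signals, then take a weighted difference.
    Higher output = more hallucination-like, under the assumed signs."""
    heads = sorted(head_scores, reverse=True)[:top_heads]
    ffns = sorted(ffn_scores, reverse=True)[:top_ffn]
    # beta-weighted parametric-knowledge term minus alpha-weighted
    # external-context (copying) term.
    return beta * sum(ffns) / len(ffns) - alpha * sum(heads) / len(heads)
```

With toy signals, `redeep_style_score([0.9, 0.1, 0.5], [0.2, 0.8], top_heads=2, top_ffn=1)` averages the two strongest head signals (0.7) against the strongest FFN signal (0.8), giving 1.6 × 0.8 − 1.0 × 0.7 ≈ 0.58.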
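The AARF grid search quoted above is simple enough to sketch directly. Only the search ranges come from the quote; `detection_score` is a hypothetical objective standing in for whatever validation metric the authors maximized.

```python
import itertools

def grid_search_aarf(detection_score):
    """Exhaustive grid search over AARF's hyperparameters, using the
    ranges quoted in the Experiment Setup row: tau in (0, 1) step 0.1,
    alpha2 in [1, 2, 5, 10], beta2 in (0, 1) step 0.1."""
    taus = [round(0.1 * i, 1) for i in range(1, 10)]    # 0.1 .. 0.9
    alpha2s = [1, 2, 5, 10]
    beta2s = [round(0.1 * i, 1) for i in range(1, 10)]  # 0.1 .. 0.9
    best_params, best_score = None, float("-inf")
    for tau, a2, b2 in itertools.product(taus, alpha2s, beta2s):
        score = detection_score(tau, a2, b2)  # evaluate on the validation split
        if score > best_score:
            best_params, best_score = (tau, a2, b2), score
    return best_params, best_score

# Toy concave objective just to exercise the loop; its peak lies on the grid.
params, score = grid_search_aarf(
    lambda t, a, b: -(t - 0.5) ** 2 - (a - 2) ** 2 - (b - 0.3) ** 2)
# params == (0.5, 2, 0.3)
```

This is 9 × 4 × 9 = 324 evaluations, which is why the quote calls the AARF parameters "relatively simple" to tune.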