Contrastive Learning Reduces Hallucination in Conversations
Authors: Weiwei Sun, Zhengliang Shi, Shen Gao, Pengjie Ren, Maarten de Rijke, Zhaochun Ren
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on the Wizard of Wikipedia, a public, open-domain knowledge-grounded dialogue benchmark, and assess the effectiveness of MixCL. MixCL effectively reduces the hallucination of LMs in conversations and achieves the highest performance among LM-based dialogue agents in terms of relevancy and factuality. ... Our contributions are as follows: ... (iv) Experiments on the Wizard-of-Wikipedia dataset show that MixCL effectively reduces the hallucinating content produced by the LM and achieves comparable performance to KB-based approaches. ... 6 Experimental Setup ... 7 Experimental Results |
| Researcher Affiliation | Academia | Shandong University, Qingdao, China; University of Amsterdam, Amsterdam, The Netherlands |
| Pseudocode | No | The paper describes its method through text and diagrams (e.g., Figure 3) but does not include formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release our code at https://github.com/sunnweiwei/MixCL. |
| Open Datasets | Yes | We conduct experiments on the Wizard of Wikipedia (WoW) dataset. WoW is built with crowd-sourcing and employs Wikipedia as the knowledge corpus. ... The ground-truth knowledge used in each turn is manually labeled. ... (Dinan et al. 2019) |
| Dataset Splits | No | The paper does not specify the train/validation/test splits used for the Wizard of Wikipedia dataset in its experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions the models and tools it builds on but does not list specific software libraries or version numbers needed to reproduce its experiments. |
| Experiment Setup | Yes | We determine the hyperparameters through pilot experiments. We set the weight of the language model loss α3 to 0.3 at initialization and linearly decay until 0. We set α1 and α2, i.e., the weight of the MLE loss and MCL loss, to 0.4 and 0.3, respectively, and linearly increase to 0.5 and 0.5. We use greedy decoding in testing. |
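
For concreteness, here is a minimal sketch of the loss-weight schedule quoted in the Experiment Setup row, assuming linear interpolation over training steps and a combined objective of the form α1·L_MLE + α2·L_MCL + α3·L_LM (our reading of the quoted setup). The function and variable names are hypothetical, not taken from the authors' released code:

```python
# Hypothetical sketch of the linear loss-weight schedule described in the paper.
# The loss names (MLE, MCL, LM) follow the paper's description; everything else
# here is illustrative, not the authors' implementation.

def loss_weights(step: int, total_steps: int) -> tuple[float, float, float]:
    """Linearly interpolate the three loss weights over training.

    alpha1 (MLE loss): 0.4 -> 0.5
    alpha2 (MCL loss): 0.3 -> 0.5
    alpha3 (LM loss):  0.3 -> 0.0
    """
    frac = min(step / max(total_steps, 1), 1.0)  # training progress in [0, 1]
    alpha1 = 0.4 + (0.5 - 0.4) * frac   # linearly increase to 0.5
    alpha2 = 0.3 + (0.5 - 0.3) * frac   # linearly increase to 0.5
    alpha3 = 0.3 * (1.0 - frac)         # linearly decay to 0
    return alpha1, alpha2, alpha3


# Example: weights at the midpoint of training; the individual losses are
# assumed to be computed elsewhere in the training loop.
a1, a2, a3 = loss_weights(step=500, total_steps=1000)
# total_loss = a1 * mle_loss + a2 * mcl_loss + a3 * lm_loss
```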