reproducibilityindex.ai

Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization

Authors: Xinyu Lyu, Beitao Chen, Lianli Gao, Hengtao Shen, Jingkuan Song

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experimental research demonstrates that our HIO strategy can effectively reduce hallucinations in LVLMs, outperforming state-of-the-art methods across various benchmarks.
Researcher Affiliation	Academia	1 Southwestern University of Finance and Economics, Chengdu, China; 2 Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China; 3 Center for Future Media, University of Electronic Science and Technology of China; 4 Tongji University; 5 Engineering Research Center of Intelligent Finance, Ministry of Education
Pseudocode	Yes	Algorithm 1 Training LVLM to Amplify Multiple Targeted Hallucination
Open Source Code	Yes	Code is released at https://github.com/BT-C/HIO.
Open Datasets	Yes	We evaluate HIO on three benchmarks including: (1) Quantitative metrics POPE Li et al. [2023b] on MSCOCO Lin et al. [2014] dataset. (2) CHAIR Rohrbach et al. [2018], Caption Hallucination Assessment with Image Relevance... (3) General-purposed Multimodal Large Language Model Evaluation (MME) Fu et al. [2023] benchmark...
Dataset Splits	Yes	Tab.2 and Tab.5 display results for 500 randomly selected images from the COCO val2017 and val2014 datasets, respectively.
Hardware Specification	Yes	The training is conducted on a robust computational setup: 4x RTX 3090 GPUs for LLaVA 1.5, 8x V100 GPUs for MiniGPT-4, and 4x A6000 GPUs for InstructBLIP.
Software Dependencies	No	The paper does not specify version numbers for any ancillary software dependencies (e.g., Python, PyTorch, CUDA versions).
Experiment Setup	Yes	Hyperparameters including alpha and beta are set to 1.0 and 0.1, respectively, in accordance with the VCD model's specifications.
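
To make the roles of alpha and beta in the Experiment Setup row concrete, here is a minimal sketch of one VCD-style contrastive decoding step. It assumes HIO uses the two hyperparameters the way VCD does: alpha scales how strongly the hallucination-prone model's logits are subtracted, and beta sets a plausibility cutoff relative to the base model's most likely token. The function name `contrastive_scores` and the toy logits are hypothetical and are not taken from the HIO repository.

```python
import math


def contrastive_scores(logits_base, logits_hallucinating, alpha=1.0, beta=0.1):
    """Score next-token candidates with a VCD-style contrast (illustrative only).

    logits_base: logits from the original model (hypothetical input).
    logits_hallucinating: logits from the hallucination-amplified model.
    alpha: contrast strength; beta: plausibility threshold (assumed VCD roles).
    """
    if len(logits_base) != len(logits_hallucinating):
        raise ValueError("both logit lists must cover the same vocabulary")

    # A token is plausible if p_base(token) >= beta * max p_base.
    # In logit space this is: logit >= log(beta) + max logit.
    cutoff = math.log(beta) + max(logits_base)

    scores = []
    for base, hall in zip(logits_base, logits_hallucinating):
        if base >= cutoff:
            # Boost the base model, penalize what the hallucinating model prefers.
            scores.append((1.0 + alpha) * base - alpha * hall)
        else:
            # Implausible tokens are masked out entirely.
            scores.append(float("-inf"))
    return scores


# Toy three-token vocabulary: token 0 is favored by the hallucinating model.
example = contrastive_scores([2.0, 1.5, -5.0], [3.0, 0.5, -9.0])
best_token = max(range(len(example)), key=lambda i: example[i])
```

In this toy case the base model alone would pick token 0, but the contrast shifts the choice to token 1 because the hallucination-prone model prefers token 0. Token 2 is masked by the beta cutoff, so it cannot be selected however large its contrast score would have been.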