HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding

Authors: Zhaorun Chen, Zhuokai Zhao, Hongyin Luo, Huaxiu Yao, Bo Li, Jiawei Zhou

ICML 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experimental studies demonstrate the effectiveness of HALC in reducing OH, outperforming state-of-the-art methods across four benchmarks. Code is released at https://github.com/BillChan226/HALC.
Researcher Affiliation Academia (1) University of Chicago, Chicago IL, USA; (2) Massachusetts Institute of Technology, Boston MA, USA; (3) UNC Chapel Hill, Chapel Hill NC, USA; (4) University of Illinois at Urbana-Champaign, Champaign IL, USA; (5) Toyota Technological Institute at Chicago, Chicago IL, USA.
Pseudocode Yes Algorithm 1 HALC Decoding
Open Source Code Yes Code is released at https://github.com/BillChan226/HALC.
Open Datasets Yes We evaluate HALC on three benchmarks including (1) quantitative metrics CHAIR (Rohrbach et al., 2018) and POPE (Li et al., 2023) on the MSCOCO (Lin et al., 2014) dataset; (2) the general-purpose Multimodal Large Language Model Evaluation (MME) (Fu et al., 2023) benchmark; and (3) the qualitative evaluation benchmark LLaVA-Bench (Liu et al., 2023a).
Dataset Splits Yes Following existing evaluation procedures (Huang et al., 2023; Yin et al., 2023; Liu et al., 2023b), we randomly sampled 500 images from the validation split of MSCOCO (Lin et al., 2014) and conducted evaluations with both CHAIR and POPE.
Hardware Specification No The paper mentions support from "Research Computing Center at the University of Chicago" and "Google Cloud Research Credits program" but does not specify any particular hardware models (e.g., GPU types, CPU models, memory sizes) used for the experiments.
Software Dependencies No The paper mentions software like "spaCy English pipeline" and "Hugging Face Transformers Repository" and models like "Grounding DINO" and "OWLv2", but it does not provide specific version numbers for these software components.
Experiment Setup Yes The complete hyper-parameters for HALC in our experiments in Section 6 are reported in Table 8. Specifically, there are four major hyper-parameters that can actively adjust the effectiveness of HALC to adapt to different task settings: 1. FOV Sampling Distribution: ... 2. Number of Sampled FOVs n: ... 3. JSD Buffer Size m: ... 4. Beam Size k: ...
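To make the role of the n and m hyper-parameters above concrete, here is a minimal, self-contained sketch of Jensen-Shannon divergence (JSD) ranking over n sampled fields of view (FOVs). It assumes only that each FOV yields a next-token probability distribution; the function names (`jsd`, `top_m_divergent_pairs`) are illustrative stand-ins, not identifiers from the HALC paper or repository.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions
    (symmetric, bounded by ln 2 in nats)."""
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)

def top_m_divergent_pairs(dists, m):
    """Rank all FOV pairs by JSD and keep the m most-divergent ones --
    a hypothetical stand-in for a JSD buffer of size m over n sampled FOVs."""
    pairs = []
    for i in range(len(dists)):
        for j in range(i + 1, len(dists)):
            pairs.append((jsd(dists[i], dists[j]), i, j))
    pairs.sort(reverse=True)
    return pairs[:m]

# Toy usage: n = 3 sampled FOVs, each giving a 3-token distribution.
fov_dists = [
    [0.7, 0.2, 0.1],          # FOV 0
    [0.1, 0.2, 0.7],          # FOV 1
    [1 / 3, 1 / 3, 1 / 3],    # FOV 2
]
buffer = top_m_divergent_pairs(fov_dists, m=2)  # keep the 2 most-contrastive pairs
```

Larger n widens the search over visual contexts, while m caps how many high-contrast candidate pairs are retained for contrastive decoding, trading recall against compute.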