HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding

Authors: Zhaorun Chen, Zhuokai Zhao, Hongyin Luo, Huaxiu Yao, Bo Li, Jiawei Zhou

ICML 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experimental studies demonstrate the effectiveness of HALC in reducing OH, outperforming state-of-the-art methods across four benchmarks. Code is released at https://github.com/BillChan226/HALC.
Researcher Affiliation Academia (1) University of Chicago, Chicago IL, USA; (2) Massachusetts Institute of Technology, Boston MA, USA; (3) UNC Chapel Hill, Chapel Hill NC, USA; (4) University of Illinois at Urbana-Champaign, Champaign IL, USA; (5) Toyota Technological Institute at Chicago, Chicago IL, USA.
Pseudocode Yes Algorithm 1 HALC Decoding
Open Source Code Yes Code is released at https://github.com/BillChan226/HALC.
Open Datasets Yes We evaluate HALC on three benchmarks including (1) quantitative metrics CHAIR (Rohrbach et al., 2018) and POPE (Li et al., 2023) on the MSCOCO (Lin et al., 2014) dataset; (2) the general-purpose Multimodal Large Language Model Evaluation (MME) (Fu et al., 2023) benchmark; and (3) the qualitative evaluation benchmark LLaVA-Bench (Liu et al., 2023a).
Dataset Splits Yes Following existing evaluation procedures (Huang et al., 2023; Yin et al., 2023; Liu et al., 2023b), we randomly sampled 500 images from the validation split of MSCOCO (Lin et al., 2014) and conducted evaluations with both CHAIR and POPE.
Hardware Specification No The paper mentions support from "Research Computing Center at the University of Chicago" and "Google Cloud Research Credits program" but does not specify any particular hardware models (e.g., GPU types, CPU models, memory sizes) used for the experiments.
Software Dependencies No The paper mentions software like "spaCy English pipeline" and "Hugging Face Transformers Repository" and models like "Grounding DINO" and "OWLv2", but it does not provide specific version numbers for these software components.
Experiment Setup Yes The complete hyper-parameters for HALC in our experiments in Section 6 are reported in Table 8. Specifically, there are four major hyper-parameters that can actively adjust the effectiveness of HALC to adapt to different task settings: 1. FOV Sampling Distribution: ... 2. Number of Sampled FOVs n: ... 3. JSD Buffer Size m: ... 4. Beam Size k: ...
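To make the role of the n and m hyper-parameters above concrete, here is a minimal, self-contained sketch of Jensen-Shannon divergence (JSD) ranking over n sampled fields of view (FOVs). It assumes only that each FOV yields a next-token probability distribution; the function names (`jsd`, `top_m_divergent_pairs`) are illustrative stand-ins, not identifiers from the HALC paper or repository.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions
    (symmetric, bounded by ln 2 in nats)."""
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)

def top_m_divergent_pairs(dists, m):
    """Rank all FOV pairs by JSD and keep the m most-divergent ones --
    a hypothetical stand-in for a JSD buffer of size m over n sampled FOVs."""
    pairs = []
    for i in range(len(dists)):
        for j in range(i + 1, len(dists)):
            pairs.append((jsd(dists[i], dists[j]), i, j))
    pairs.sort(reverse=True)
    return pairs[:m]

# Toy usage: n = 3 sampled FOVs, each giving a 3-token distribution.
fov_dists = [
    [0.7, 0.2, 0.1],          # FOV 0
    [0.1, 0.2, 0.7],          # FOV 1
    [1 / 3, 1 / 3, 1 / 3],    # FOV 2
]
buffer = top_m_divergent_pairs(fov_dists, m=2)  # keep the 2 most-contrastive pairs
```

Larger n widens the search over visual contexts, while m caps how many high-contrast candidate pairs are retained for contrastive decoding, trading recall against compute.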