Unleashing Region Understanding in Intermediate Layers for MLLM-based Referring Expression Generation

Authors: Yaoyuan Liang, Zhuojun Cai, Jian Xu, Guanbo Huang, Yiran Wang, Xiao Liang, Jiahao Liu, Ziran Li, Jingang Wang, Shao-Lun Huang

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Extensive experiments conducted on the RefCOCOg and PHD benchmarks show that our proposed framework could outperform existing methods on both semantic and hallucination-related metrics." |
| Researcher Affiliation | Collaboration | ¹Tsinghua Shenzhen International Graduate School, Tsinghua University; ²Meituan Inc. |
| Pseudocode | Yes | Algorithm 1: Layer Prior Importance Calculation |
| Open Source Code | Yes | "Code will be made available at https://github.com/Glupayy/unleash-eliminate." |
| Open Datasets | Yes | "Extensive experiments conducted on the RefCOCOg [33] and PHD [28] benchmarks" |
| Dataset Splits | Yes | "we randomly extracted K = 2000 samples from the RefCOCOg training set to form the triplets (I, M, Y)" |
| Hardware Specification | No | The paper does not specify GPU or CPU models, memory, or other hardware details used for the experiments; it only mentions "GPUs" in the NeurIPS checklist response. |
| Software Dependencies | No | The paper mentions models such as Osprey-7b and GLaMM, but it does not list software dependencies (e.g., Python, PyTorch, or specific library versions) with version numbers. |
| Experiment Setup | Yes | The baseline region-level MLLM, Osprey-7b, was evaluated at both lower (t = 0.2) and higher (t = 0.9) temperature settings; α = 0.1 in the implementation; K = 2000 samples were randomly extracted; the first 32 layers of Osprey-7b (where layer 0 is the embedding layer) were organized into four groups: [0, 7], [8, 15], [16, 23], [24, 31]. |
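The layer grouping described in the experiment setup can be sketched as below. This is a minimal illustration of partitioning 32 layer indices into four contiguous, equal-size groups; the function name and uniform-partition logic are assumptions for illustration, not the paper's actual implementation.

```python
def partition_layers(num_layers=32, num_groups=4):
    """Partition layer indices [0, num_layers) into contiguous equal-size groups.

    With the defaults this reproduces the grouping reported in the setup:
    [0, 7], [8, 15], [16, 23], [24, 31] (layer 0 being the embedding layer).
    NOTE: a hypothetical helper, not the paper's code.
    """
    size = num_layers // num_groups
    return [list(range(g * size, (g + 1) * size)) for g in range(num_groups)]

groups = partition_layers()
# groups[0] spans the embedding layer through layer 7; groups[3] spans 24-31.
```

Each group could then be scored separately (e.g., by the paper's Layer Prior Importance Calculation) to decide which intermediate layers to draw region descriptions from.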