Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts
Authors: Zhiwei Lin, Yongtao Wang, Zhi Tang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the long-tail instance segmentation dataset (LVIS) show that our method surpasses the previous open-ended method on the object detection task and can provide additional instance segmentation masks. Besides, VL-SAM achieves favorable performance on the corner case object detection dataset (CODA), demonstrating the effectiveness of VL-SAM in real-world applications. |
| Researcher Affiliation | Academia | Zhiwei Lin Yongtao Wang Zhi Tang Wangxuan Institute of Computer Technology, Peking University, China |
| Pseudocode | No | The paper describes the proposed framework and its components using text and figures, but it does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | No | We do not provide new datasets and will release the demo after the paper is accepted. |
| Open Datasets | Yes | We evaluate VL-SAM on the LVIS dataset [14], which has a long tail of categories and annotations for over 1000 object categories. ... To further demonstrate the effectiveness of the proposed method in the real-world application, we present the results of VL-SAM on corner case object detection dataset CODA for autonomous driving in Table 2. |
| Dataset Splits | No | The paper mentions evaluating on "LVIS minival", which is a validation split. However, it does not explicitly provide the overall train/validation/test splits (e.g., percentages or sample counts) needed for reproduction, nor does it define how the splits were partitioned for its own setup beyond citing the evaluation datasets. |
| Hardware Specification | Yes | All models are inferred on an 80G A800 machine. |
| Software Dependencies | No | The paper mentions specific vision-language and segmentation models used (e.g., "CogVLM-17B," "Vicuna-7B-v1.5," "SAM with ViT-Huge"), but it does not specify foundational software dependencies like programming languages (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries with their explicit version numbers. |
| Experiment Setup | Yes | We set the temperature to 0.8 and top-p for nucleus sampling to 0.1 for CogVLM-17B. |
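The quoted setup reports only two decoding hyperparameters. As a point of reference for what temperature 0.8 and top-p 0.1 mean during sampling, here is a minimal pure-Python sketch of temperature-scaled nucleus sampling; the function name and example logits are illustrative, not taken from the paper:

```python
import math
import random

def nucleus_sample(logits, temperature=0.8, top_p=0.1, rng=None):
    """Sample a token index: temperature-scaled softmax, then top-p filtering."""
    rng = rng or random.Random(0)
    # Temperature-scaled softmax (numerically stabilized).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of highest-probability tokens whose
    # cumulative mass reaches top_p (the "nucleus").
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalize over the nucleus and sample from it.
    mass = sum(probs[i] for i in kept)
    r = rng.random() * mass
    acc = 0.0
    for i in kept:
        acc += probs[i]
        if r <= acc:
            return i
    return kept[-1]
```

Note that top-p = 0.1 is a very restrictive nucleus: whenever the most likely token already carries at least 10% of the probability mass, it is the only candidate, so decoding is close to greedy despite the moderate temperature.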