Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Renovating Names in Open-Vocabulary Segmentation Benchmarks
Authors: Haiwen Huang, Songyou Peng, Dan Zhang, Andreas Geiger
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Experiments |
| Researcher Affiliation | Collaboration | 1 Bosch IoC Lab, University of Tübingen; 2 Tübingen AI Center; 3 Autonomous Vision Group, University of Tübingen; 4 ETH Zurich; 5 MPI for Intelligent Systems, Tübingen; 6 Bosch Center for Artificial Intelligence |
| Pseudocode | No | The paper describes the method using text and figures (e.g., Fig. 2), but does not contain a formal pseudocode or algorithm block. |
| Open Source Code | Yes | We provide our code and relabelings for several popular segmentation datasets to the research community on our project page: https://andrehuang.github.io/renovate/. |
| Open Datasets | Yes | We renovate three panoptic segmentation datasets respectively: MS COCO [13], ADE20K [14], and Cityscapes [15]. |
| Dataset Splits | Yes | We train a renaming model for 60k iterations with a batch size of 16 on the training set, then generate RENOVATE names for the entire dataset. |
| Hardware Specification | Yes | Training one renaming model on 4 A-100 GPUs requires approximately 3 days. |
| Software Dependencies | No | The paper mentions software like GPT-4, CLIP, Mask2Former, CaSED, and AdamW, but does not provide specific version numbers for these or other key software components used in the experiments. |
| Experiment Setup | Yes | We train a renaming model for 60k iterations with a batch size of 16 on the training set |