Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Understanding Multi-Granularity for Open-Vocabulary Part Segmentation
Authors: Jiho Choi, Seonho Lee, Seungho Lee, Minhyun Lee, Hyunjung Shim
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that PartCLIPSeg outperforms existing state-of-the-art OVPS methods, offering refined segmentation and an advanced understanding of part relationships within images. |
| Researcher Affiliation | Academia | Jiho Choi¹, Seonho Lee¹, Seungho Lee², Minhyun Lee², Hyunjung Shim¹; ¹Graduate School of Artificial Intelligence, KAIST, Republic of Korea; ²School of Integrated Technology, Yonsei University, Republic of Korea |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/kaist-cvml/part-clipseg. |
| Open Datasets | Yes | We evaluate our method on three part segmentation datasets: Pascal-Part-116 [7, 46], ADE20K-Part-234 [46, 57], and PartImageNet [21]. |
| Dataset Splits | Yes | Pascal-Part-116 [7, 46] consists of 8,431 training images and 850 test images. ADE20K-Part-234 [46, 57] consists of 7,347 training images and 1,016 validation images. |
| Hardware Specification | Yes | All our experiments are conducted on 8 NVIDIA A6000 GPUs. |
| Software Dependencies | No | The paper mentions using CLIP ViT-B/16 and the AdamW optimizer but does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The model is trained using the AdamW optimizer with a base learning rate of 0.0001 over 20,000 iterations, with a batch size of 8 images. We employ a WarmupPolyLR learning rate scheduler to manage the learning rate throughout the training process. To ensure model stability, we apply gradient clipping with a maximum gradient norm of 0.01. |
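
The reported experiment setup can be illustrated with a minimal, framework-free sketch. The base learning rate (0.0001), the iteration count (20,000), and the maximum gradient norm (0.01) come from the table above. The warmup length, warmup factor, and polynomial power are not reported here, so the values below are assumptions chosen only for illustration, and the function names `warmup_poly_lr` and `clip_by_global_norm` are hypothetical. This is not the authors' implementation, which is available in their repository.

```python
import math

# Values taken from the reported experiment setup.
BASE_LR = 1e-4
MAX_ITERS = 20_000
MAX_GRAD_NORM = 0.01

# Assumed values: not stated in the table above.
WARMUP_ITERS = 1_000
WARMUP_FACTOR = 0.001
POLY_POWER = 0.9


def warmup_poly_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup multiplied by polynomial decay."""
    step = min(max(step, 0), MAX_ITERS)
    poly = (1.0 - step / MAX_ITERS) ** POLY_POWER
    if step < WARMUP_ITERS:
        alpha = step / WARMUP_ITERS
        # Ramp linearly from WARMUP_FACTOR up to 1.0 over the warmup period.
        warmup = WARMUP_FACTOR * (1.0 - alpha) + alpha
    else:
        warmup = 1.0
    return BASE_LR * warmup * poly


def clip_by_global_norm(grads, max_norm=MAX_GRAD_NORM):
    """Rescale a flat list of gradient values so their L2 norm is at most `max_norm`."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Never scale gradients up; the small epsilon avoids division by zero.
    scale = min(1.0, max_norm / (total_norm + 1e-6))
    return [g * scale for g in grads]


if __name__ == "__main__":
    for step in (0, 500, 1_000, 10_000, 20_000):
        print(step, warmup_poly_lr(step))
    print(clip_by_global_norm([3.0, 4.0]))
```

A maximum gradient norm of 0.01 is small, so under this sketch most gradient vectors would be rescaled, leaving the direction intact while the step size is bounded.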