Understanding Multi-Granularity for Open-Vocabulary Part Segmentation
Authors: Jiho Choi, Seonho Lee, Seungho Lee, Minhyun Lee, Hyunjung Shim
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that PartCLIPSeg outperforms existing state-of-the-art OVPS methods, offering refined segmentation and an advanced understanding of part relationships within images. |
| Researcher Affiliation | Academia | Jiho Choi¹, Seonho Lee¹, Seungho Lee², Minhyun Lee², Hyunjung Shim¹; ¹Graduate School of Artificial Intelligence, KAIST, Republic of Korea; ²School of Integrated Technology, Yonsei University, Republic of Korea |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/kaist-cvml/part-clipseg. |
| Open Datasets | Yes | We evaluate our method on three part segmentation datasets: Pascal-Part-116 [7, 46], ADE20K-Part-234 [46, 57], and PartImageNet [21]. |
| Dataset Splits | Yes | Pascal-Part-116 [7, 46] consists of 8,431 training images and 850 test images. ADE20K-Part-234 [46, 57] consists of 7,347 training images and 1,016 validation images. |
| Hardware Specification | Yes | All our experiments are conducted on 8 NVIDIA A6000 GPUs. |
| Software Dependencies | No | The paper mentions using CLIP ViT-B/16 and the ADAMW optimizer but does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The model is trained using the ADAMW optimizer with a base learning rate of 0.0001 over 20,000 iterations, with a batch size of 8 images. We employ a Warmup Poly LR scheduler to manage the learning rate throughout training. To ensure model stability, we apply gradient clipping with a maximum gradient norm of 0.01. |
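
The training recipe reported in the Experiment Setup row can be summarized as a minimal PyTorch sketch. This is not the authors' code: only the optimizer, base learning rate, iteration count, batch size, scheduler family, and gradient-norm clip value come from the table above; the placeholder model, the warmup length, and the polynomial decay power are assumptions for illustration.

```python
# Minimal sketch of the reported training configuration (assumed, not the authors' code).
# Reported values: AdamW, base LR 1e-4, 20,000 iterations, batch size 8,
# Warmup Poly LR schedule, gradient-norm clipping at 0.01.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(512, 512)  # placeholder standing in for the PartCLIPSeg model
optimizer = AdamW(model.parameters(), lr=1e-4)

max_iters = 20_000
warmup_iters = 1_500   # assumed; the warmup length is not reported
power = 0.9            # assumed; the polynomial decay power is not reported

def warmup_poly(step: int) -> float:
    """LR multiplier: linear warmup followed by polynomial decay."""
    if step < warmup_iters:
        return (step + 1) / warmup_iters
    progress = (step - warmup_iters) / max(1, max_iters - warmup_iters)
    return (1.0 - progress) ** power

scheduler = LambdaLR(optimizer, lr_lambda=warmup_poly)

for step in range(max_iters):
    loss = model(torch.randn(8, 512)).mean()  # dummy batch of 8; real training uses image batches
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.01)  # reported max gradient norm
    optimizer.step()
    scheduler.step()
```

The sketch only mirrors the hyperparameters stated in the table; the released repository (https://github.com/kaist-cvml/part-clipseg) remains the authoritative source for the actual training pipeline.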