Segment Anything without Supervision
Authors: XuDong Wang, Jingfeng Yang, Trevor Darrell
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluated across seven popular datasets, UnSAM achieves results competitive with its supervised counterpart SAM, and surpasses the previous state of the art in unsupervised segmentation by 11% in terms of AR. |
| Researcher Affiliation | Academia | XuDong Wang, Jingfeng Yang, Trevor Darrell (UC Berkeley) |
| Pseudocode | Yes | Algorithm 1: Divide and Conquer (see the conquer-stage sketch below the table) |
| Open Source Code | Yes | code: https://github.com/frank-xwang/UnSAM |
| Open Datasets | Yes | MSCOCO [24], LVIS [15], SA-1B [21], ADE [48], Entity [29], PartImageNet [16] and PACO [30]. |
| Dataset Splits | No | The paper mentions training on a percentage of SA-1B and evaluating on COCO val2017 and an SA-1B test set, but it does not explicitly state a validation split for its training process; the standard evaluation sets serve as de facto validation sets for benchmarking. |
| Hardware Specification | Yes | All model training in this paper was conducted using either 4 A100 GPUs or 8 RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions software components and models like DINO, Mask2Former, and Semantic-SAM, but does not provide specific version numbers for them or other dependencies. |
| Experiment Setup | Yes | In the divide stage, we set the confidence threshold τ = 0.3; in the conquer stage, we choose thresholds θ_merge = [0.6, 0.5, 0.4, 0.3, 0.2, 0.1]. ... The default learning rate is 5e-5 with a batch size of 16 and a weight decay of 5e-2. We train the model for 8 epochs. ... The default learning rate is 1e-4 with a batch size of 8. The learning rate decreases by a factor of 10 at 90% and 95% of the training iterations. We train the model for 4 epochs. (A hedged configuration sketch follows below.) |
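
The paper's Algorithm 1 is not reproduced in this report. As a rough illustration only, here is a minimal Python sketch of what the conquer-stage merging with the quoted θ_merge schedule could look like, assuming L2-normalized per-pixel self-supervised features (e.g., from DINO) and an initial over-segmentation. The function names `conquer_stage` and `merge_once` are hypothetical; the actual method additionally covers the divide stage, spatial adjacency constraints, and mask post-processing.

```python
import numpy as np

def merge_once(feats, labels, theta):
    """One agglomerative round: union regions whose mean-feature cosine
    similarity exceeds theta. feats is (N, D) L2-normalized; labels is (N,)."""
    ids = np.unique(labels)
    centroids = np.stack([feats[labels == i].mean(axis=0) for i in ids])
    centroids /= np.linalg.norm(centroids, axis=1, keepdims=True) + 1e-8
    sim = centroids @ centroids.T

    parent = {int(i): int(i) for i in ids}  # union-find forest over region ids
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            if sim[a, b] > theta:
                ra, rb = find(int(ids[a])), find(int(ids[b]))
                if ra != rb:
                    parent[rb] = ra         # merge the two regions

    return np.array([find(int(l)) for l in labels])

def conquer_stage(feats, init_labels, thresholds=(0.6, 0.5, 0.4, 0.3, 0.2, 0.1)):
    """Merge at each successively lower threshold; every level's label map is
    coarser than the last, giving a part-to-whole hierarchy of segments."""
    labels, levels = init_labels.copy(), []
    for theta in thresholds:
        labels = merge_once(feats, labels, theta)
        levels.append(labels.copy())
    return levels
```

With the θ_merge values quoted above, the first pass joins only very similar regions, and each lower threshold merges more aggressively, so the recorded levels sweep from fine parts toward whole entities.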
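Similarly, the hyperparameters in the Experiment Setup row can be expressed as a short PyTorch configuration sketch. Only the learning rates, weight decay, batch sizes, epoch counts, and decay milestones come from the paper; the optimizer choice (AdamW), the stand-in model, and the iteration count are assumptions for illustration.

```python
import torch

model = torch.nn.Conv2d(3, 1, kernel_size=1)  # stand-in for the real model

# First reported schedule: lr 5e-5, weight decay 5e-2, batch size 16, 8 epochs.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5, weight_decay=5e-2)

# Second reported schedule: lr 1e-4, batch size 8, 4 epochs, with the lr
# dropped by a factor of 10 at 90% and 95% of the training iterations.
total_iters = 10_000  # placeholder; the real count depends on dataset size
optimizer_ft = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer_ft,
    milestones=[int(0.90 * total_iters), int(0.95 * total_iters)],
    gamma=0.1,
)
```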