Flaws can be Applause: Unleashing Potential of Segmenting Ambiguous Objects in SAM

Authors: Chenxin Li, Yuzhihuang, Wuyang Li, Hengyu Liu, Xinyu Liu, Qing Xu, Zhen Chen, Yue Huang, Yixuan Yuan

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4 Experiment; 4.1 Experimental Setup; Tab. 1 presents the quantitative results on four datasets
Researcher Affiliation | Academia | 1 The Chinese University of Hong Kong; 2 Xiamen University; 3 University of Nottingham Ningbo China; 4 Yale University
Pseudocode | No | The paper describes the methodology and training pipeline in text and diagrams (Figure 2), but does not include a dedicated pseudocode or algorithm block.
Open Source Code | Yes | Project page: https://a-sa-m.github.io/. All utilized data are sourced from open-access platforms. The code, which will be made publicly available, is uploaded as a zip file.
Open Datasets | Yes | Four datasets are utilized for comparison. The LIDC-IDRI dataset [2] is used for lung lesion segmentation... The BraTS 2017 dataset [18] is used for 3D brain tumor segmentation... The ISBI 2016 dataset [14] contains 900 dermoscopic images for training... The SIM10k dataset [19] consists of 10,000 images rendered by the gaming engine Grand Theft Auto...
Dataset Splits | Yes | The ISBI 2016 dataset [14] contains 900 dermoscopic images for training and 379 images for testing.
Hardware Specification | No | The paper mentions 'computational resources' in the introduction but does not specify any exact hardware details such as CPU/GPU models, memory, or specific computing environments used for the experiments.
Software Dependencies | No | The paper mentions using the Adam optimizer and specific learning rates, but does not provide specific software dependencies with version numbers for libraries like PyTorch, TensorFlow, or Python.
Experiment Setup | Yes | All three datasets are optimized using the Adam optimizer, with a learning rate of 1e-4, over 100 epochs. For the SIM10k dataset, we select images where pixels from two instances overlap, creating three potential masks. Optimization for this dataset is carried out with Adam optimizer over 500 epochs, with a learning rate of 1e-4. The trade-off coefficients are set as αP = αI = 1. In our experimental design, we set the number of mask weights W to 8 and initialize each weight to 1/8.
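
The hyperparameters quoted in the Experiment Setup entry can be collected into a small configuration object, which may help when attempting a reproduction. The following is only a sketch: the class name `TrainConfig`, the helper `config_for`, and all field names are hypothetical and do not come from the paper or its code release, and no deep learning framework is assumed because the paper does not state one. Only the numeric values (Adam, learning rate 1e-4, 100 or 500 epochs, trade-off coefficients of 1, 8 mask weights initialized to 1/8) are taken from the quoted text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup entry.

    All field names are hypothetical; only the values come from the quoted text.
    """
    optimizer: str = "adam"
    learning_rate: float = 1e-4
    epochs: int = 100
    alpha_p: float = 1.0  # trade-off coefficient written as αP in the source
    alpha_i: float = 1.0  # trade-off coefficient written as αI in the source
    num_mask_weights: int = 8
    mask_weights: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Each mask weight starts at 1 / W (1/8 for W = 8).
        if not self.mask_weights:
            self.mask_weights = [1.0 / self.num_mask_weights] * self.num_mask_weights


def config_for(dataset: str) -> TrainConfig:
    """Return the quoted settings for a dataset name (hypothetical helper).

    SIM10k is trained for 500 epochs; the other datasets for 100.
    """
    is_sim10k = dataset.replace(" ", "").lower() == "sim10k"
    return TrainConfig(epochs=500 if is_sim10k else 100)
```

For instance, `config_for("SIM10k")` yields the 500-epoch setting, while `config_for("ISBI 2016")` keeps the 100-epoch default; in both cases the eight mask weights sum to 1.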