Spider: A Unified Framework for Context-dependent Concept Segmentation

Authors: Xiaoqi Zhao, Youwei Pang, Wei Ji, Baicheng Sheng, Jiaming Zuo, Lihe Zhang, Huchuan Lu

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Spider significantly outperforms the state-of-the-art specialized models on 8 different context-dependent segmentation tasks, including 4 natural scenes (salient, camouflaged, and transparent objects, and shadow) and 4 medical lesions (COVID-19, polyp, breast, and skin lesion, with CT, colonoscopy, ultrasound, and dermoscopy modalities, respectively).
Researcher Affiliation | Collaboration | (1) Dalian University of Technology, China; (2) X3000 Inspection Co., Ltd, China; (3) Yale University, USA.
Pseudocode | Yes | Algorithm 1: Training and Inference
Open Source Code | No | The source code will be publicly available at Spider-UniCDSeg.
Open Datasets | Yes | The dataset information is shown in Table 1. We follow the training settings of recent state-of-the-art methods on these tasks and merge all training samples together as our training set. Table 1 lists datasets such as DUTS (Wang et al., 2017) and COD10K (Fan et al., 2020a), among others, indicating widely used public datasets with citations. A sketch of this dataset merging appears after the table.
Dataset Splits | No | Table 1 lists '#Train' and '#Test' sample counts but does not specify a distinct validation split for the datasets used in the experiments.
Hardware Specification | Yes | All experiments are run on 8 Tesla A100 GPUs, training for 50 epochs.
Software Dependencies | No | The paper mentions specific optimizers such as Adam and backbones such as ViT, Swin, and ConvNeXt, but does not provide version numbers for these components or for other libraries and frameworks (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | Input images are resized to 384×384. For each task, the mini-batch sizes of the input and prompt are set to 4 and 12, respectively. Basic image augmentations are applied to avoid overfitting, including random flipping, rotation, and border clipping. The Adam (Kingma & Ba, 2015) optimizer, with an initial learning rate of 0.0001 on a step schedule (step size 30, decay rate 0.9), is used to update model parameters. A training-configuration sketch appears after the table.
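
The "Open Datasets" row states that all per-task training samples are merged into a single training set. Below is a minimal PyTorch sketch of that merging; the `SegFolder` class, the directory layout, and the paths are hypothetical illustrations, not the authors' actual loaders.

```python
# Minimal sketch: merge per-task training sets into one training set,
# as described in the "Open Datasets" row. Paths and the loader class
# are hypothetical placeholders, not the authors' code.
from pathlib import Path

from PIL import Image
import torchvision.transforms.functional as TF
from torch.utils.data import ConcatDataset, DataLoader, Dataset


class SegFolder(Dataset):
    """Loads (image, mask) pairs from <root>/images and <root>/masks."""

    def __init__(self, root: str, size: int = 384):
        self.images = sorted(Path(root, "images").glob("*"))
        self.masks = sorted(Path(root, "masks").glob("*"))
        self.size = size

    def __len__(self):
        return len(self.images)

    def __getitem__(self, i):
        img = Image.open(self.images[i]).convert("RGB")
        msk = Image.open(self.masks[i]).convert("L")
        # Resize to 384x384, matching the reported input resolution.
        img = TF.to_tensor(TF.resize(img, [self.size, self.size]))
        msk = TF.to_tensor(TF.resize(msk, [self.size, self.size]))
        return img, msk


# One dataset per context-dependent task; all training samples are
# merged together, as the paper states (hypothetical directory layout).
merged_train = ConcatDataset([
    SegFolder("data/DUTS/train"),    # salient object detection
    SegFolder("data/COD10K/train"),  # camouflaged object detection
    # ... remaining natural-scene and medical-lesion training sets
])
loader = DataLoader(merged_train, batch_size=4, shuffle=True, num_workers=4)
```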
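
The hyperparameters in the "Experiment Setup" row map directly onto PyTorch's `Adam` and `StepLR`. A minimal sketch follows, assuming "decay size of 30" means a StepLR step size of 30 epochs; the stand-in model and loss are hypothetical, and `loader` refers to the merging sketch above.

```python
# Training-configuration sketch for the reported hyperparameters:
# Adam, initial LR 1e-4, step schedule (step size 30, decay rate 0.9),
# 50 epochs. The model and loss are placeholders, not the authors' code.
import torch
from torch import nn, optim

model = nn.Conv2d(3, 1, 1)  # stand-in for the Spider network
criterion = nn.BCEWithLogitsLoss()

optimizer = optim.Adam(model.parameters(), lr=1e-4)
# "Decay size of 30, decay rate of 0.9" is read here as
# StepLR(step_size=30, gamma=0.9): the LR is multiplied by 0.9
# every 30 epochs (an assumption about the paper's wording).
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.9)

for epoch in range(50):  # the paper reports training for 50 epochs
    for images, masks in loader:  # `loader` from the merging sketch above
        optimizer.zero_grad()
        loss = criterion(model(images), masks)
        loss.backward()
        optimizer.step()
    scheduler.step()
```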