AIMS: All-Inclusive Multi-Level Segmentation for Anything
Authors: Lu Qi, Jason Kuen, Weidong Guo, Jiuxiang Gu, Zhe Lin, Bo Du, Yu Xu, Ming-Hsuan Yang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness and generalization capacity of our method compared to other state-of-the-art methods on a single dataset or the concurrent work on segment anything. |
| Researcher Affiliation | Collaboration | Lu Qi1 Jason Kuen2 Weidong Guo3 Jiuxiang Gu2 Zhe Lin2 Bo Du4 Yu Xu3 Ming-Hsuan Yang1,5 1UC Merced 2Adobe Research 3QQ Browser Lab, Tencent 4Wuhan University 5Google Research |
| Pseudocode | No | The paper describes methods and formulas but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We will make our code and training model publicly available. |
| Open Datasets | Yes | We train our AIMS model on existing segmentation datasets such as Pascal Panoptic Parts [6], COCO-PSG [8], PACO [9], and Entity Seg [5]. We construct our training set by aggregating images from five segmentation datasets, including COCO [53], Entity Seg [5], Pascal VOC Part (PPP) [6], PACO [9], and COCO-PSG [8]. |
| Dataset Splits | Yes | Initially, we select 1069 and 1000 validation images from PPP [6] (which covers the part and entity levels) and COCO-PSG [8] (which covers the entity and relation levels) respectively. Following this, we eliminate any duplicate images in the unified training set that are present in the validation images, resulting in a refined training set comprised of approximately 236.7K unique images. |
| Hardware Specification | Yes | During each training iteration, we sample the data and tasks as introduced in the sampling strategy of Section 3.4, with a batch size of 64 on 8 A100 GPUs. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used (e.g., Python, PyTorch, TensorFlow, specific deep learning frameworks). |
| Experiment Setup | Yes | We train our model for 36,000 iterations using a base learning rate of 0.0001 and weights pre-trained on COCO-Entity [3] with the exception of images contained in our validation set. The longer edge size of the images is set to 1,333 pixels, while the shorter edge size is randomly sampled between 640 and 800 pixels, with a stride of 32 pixels. The learning rate is decayed by a factor of 0.1 after 28,000 and 33,000 iterations, respectively. During each training iteration, we sample the data and tasks as introduced in the sampling strategy of Section 3.4, with a batch size of 64 on 8 A100 GPUs. |
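The Experiment Setup row describes a step learning-rate schedule (base LR 0.0001, decayed by 0.1 at 28,000 and 33,000 of 36,000 iterations) and multi-scale resizing (shorter edge sampled from 640–800 with stride 32, longer edge capped at 1,333). A minimal sketch of that schedule in plain Python follows; the function and constant names are my own, not from the authors' (unreleased) code:

```python
import random

# Hyperparameters quoted from the paper's experiment setup.
BASE_LR = 1e-4                 # base learning rate
MILESTONES = (28_000, 33_000)  # iterations at which the LR is decayed
GAMMA = 0.1                    # decay factor applied at each milestone
TOTAL_ITERS = 36_000

def lr_at(iteration: int) -> float:
    """Learning rate in effect at a given training iteration (step decay)."""
    decays = sum(iteration >= m for m in MILESTONES)
    return BASE_LR * (GAMMA ** decays)

def sample_shorter_edge(rng=random) -> int:
    """Shorter image edge sampled from {640, 672, ..., 800} (stride 32).

    The longer edge would separately be capped at 1,333 pixels when resizing.
    """
    return rng.choice(range(640, 801, 32))

if __name__ == "__main__":
    for it in (0, 27_999, 28_000, 33_000, TOTAL_ITERS - 1):
        print(f"iter {it:>6}: lr = {lr_at(it):.6f}")
```

This reproduces only the scalar schedule; the actual training loop, task sampling (Section 3.4), and distributed batching across 8 A100 GPUs are outside its scope.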