reproducibilityindex.ai

A Simple Image Segmentation Framework via In-Context Examples

Authors: Yang Liu, Chenchen Jing, Hengtao Li, Muzhi Zhu, Hao Chen, Xinlong Wang, Chunhua Shen

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on various segmentation tasks show the effectiveness of the proposed method. Our code is released at: https://github.com/aim-uofa/SINE
Researcher Affiliation	Collaboration	Yang Liu¹, Chenchen Jing¹, Hengtao Li¹, Muzhi Zhu¹, Hao Chen¹, Xinlong Wang³, Chunhua Shen¹,²; ¹Zhejiang University, China; ²Ant Group; ³Beijing Academy of Artificial Intelligence
Pseudocode	No	The paper does not contain any pseudocode or algorithm blocks.
Open Source Code	Yes	Our code is released at: https://github.com/aim-uofa/SINE
Open Datasets	Yes	Training Data. We train our model with a diverse set of segmentation datasets, including semantic, instance, and panoptic segmentation. Specifically, we utilize three visual perception datasets: ADE20K [65] is a popular semantic segmentation dataset... COCO [31] is a widely-used dataset... Objects365 [51] is a large-scale high-quality object detection dataset.
Dataset Splits	Yes	ADE20K [65] is a popular semantic segmentation dataset... It has 25K images, including 20K for training, 2K for validation, and 3K for testing.
Hardware Specification	Yes	Our model is trained for 5 days by using 8 NVIDIA V100 GPUs.
Software Dependencies	No	The paper mentions software like DINOv2 and Adam optimizer, but does not provide specific version numbers for any software dependencies.
Experiment Setup	Yes	We train SINE about 50K steps with 64 batch sizes. We use Adam [36] optimizer and employ β1 = 0.9, β2 = 0.999 for optimization. We use a linear learning rate scheduler with a base learning rate of 1e-4 and a warmup of 100 steps. The weight decay is set to 0.05. For data augmentation, we use random horizontal flipping and the large-scale jittering (LSJ) [13] augmentation with a random scale sampled from range 0.1 to 2.0 followed by a fixed size crop to 896 × 896.
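
The Experiment Setup row can be read as a small training configuration. The sketch below collects the quoted hyperparameters in one place and shows one plausible reading of the "linear learning rate scheduler": a linear warmup to the base rate over 100 steps, followed by a linear decay to zero at step 50K. That decay shape is an assumption, since the quoted text does not specify it, and the names `TrainConfig` and `lr_at_step` are hypothetical and not taken from the SINE repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""
    total_steps: int = 50_000            # "about 50K steps"
    batch_size: int = 64
    base_lr: float = 1e-4
    warmup_steps: int = 100
    beta1: float = 0.9                   # Adam beta1
    beta2: float = 0.999                 # Adam beta2
    weight_decay: float = 0.05
    lsj_scale_range: tuple = (0.1, 2.0)  # LSJ random scale range
    crop_size: int = 896                 # fixed crop of 896 x 896


def lr_at_step(step: int, cfg: TrainConfig) -> float:
    """Learning rate at a 0-indexed step.

    ASSUMPTION: linear warmup to base_lr, then linear decay to 0 at
    total_steps. The source only says "linear" with a 100-step warmup.
    """
    if step < cfg.warmup_steps:
        # Warmup: reaches base_lr on the last warmup step.
        return cfg.base_lr * (step + 1) / cfg.warmup_steps
    remaining = cfg.total_steps - step
    decay_span = cfg.total_steps - cfg.warmup_steps
    return cfg.base_lr * max(0.0, remaining / decay_span)


if __name__ == "__main__":
    cfg = TrainConfig()
    for s in (0, 99, 100, 25_050, 50_000):
        print(s, lr_at_step(s, cfg))
```

Under this reading the rate peaks at 1e-4 once warmup ends and falls to half that value midway through the decay phase. A reproduction should confirm the actual schedule against the released code at https://github.com/aim-uofa/SINE.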