Segment Anything in High Quality

Authors: Lei Ke, Mingqiao Ye, Martin Danelljan, Yifan Liu, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

NeurIPS 2023

Reproducibility assessment: each entry lists the variable, its result, and the LLM response.
Research Type: Experimental. To validate the effectiveness of HQ-SAM, we perform extensive quantitative and qualitative experimental analysis. We compare HQ-SAM with SAM on a suite of 10 diverse segmentation datasets across different downstream tasks, 8 of which follow a zero-shot transfer protocol, including COCO [31], UVO [42], SGinW [58], LVIS [14], HQ-YTVIS [20], BIG [6], COIFT [29] and HR-SOD [51]. (A minimal evaluation sketch follows the table.)
Researcher Affiliation: Academia. ETH Zürich, HKUST, Dartmouth College.
Pseudocode: No. The paper describes the model architecture and process with text and diagrams, but does not include any structured pseudocode or algorithm blocks.
Open Source Code: Yes. Our code and pretrained models are at https://github.com/SysCV/SAM-HQ. (A hedged loading example follows the table.)
Open Datasets: Yes. To train HQ-SAM in a data-efficient manner, instead of further training on SA-1B [21], we compose a new training dataset HQSeg-44K which contains 44,320 extremely accurate image mask annotations. ... HQSeg-44K leverages a collection of six existing image datasets including DIS [35] (train set), ThinObject-5K [29] (train set), FSS-1000 [26], ECSSD [38], MSRA10K [8], DUT-OMRON [46] with extremely fine-grained mask labeling... (A composition sketch follows the table.)
Dataset Splits: Yes. For ablation experiments, we use the four aforementioned extremely accurate segmentation datasets, namely DIS (val) [35], ThinObject-5K (test) [29], COIFT [29] and HR-SOD [51], as well as the COCO validation set.
Hardware Specification: Yes. Thanks to the smaller-scale dataset and our minimal integrated architecture, HQ-SAM can be trained in only 4 hours on 8 RTX 3090 GPUs.
Software Dependencies: No. The paper does not specify version numbers for any software dependencies such as programming languages, libraries, or frameworks used in the experiments.
Experiment Setup: Yes. We use a learning rate of 0.001 and train our HQ-SAM for 12 epochs, with a learning rate drop after 10 epochs. We train on 8 Nvidia GeForce RTX 3090 GPUs with a total batch size of 32, which takes 4 hours to train for 16.6K iterations. (The schedule is sketched below.)
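On the evaluation protocol: zero-shot transfer here means HQ-SAM is evaluated on those eight benchmarks without training on them. A minimal mask-quality check in that spirit might look like the sketch below; `mask_iou` and `mean_iou` are illustrative helpers, not the paper's evaluation code, and the paper itself reports benchmark-specific metrics.

```python
# Minimal sketch of mean mask IoU over one dataset; the helpers here are
# illustrative assumptions, not the paper's evaluation code or metrics.
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """IoU between two binary masks of the same shape."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:                       # both masks empty: count as perfect
        return 1.0
    return np.logical_and(pred, gt).sum() / union

def mean_iou(pairs):
    """Average IoU over an iterable of (pred_mask, gt_mask) pairs."""
    return float(np.mean([mask_iou(p, g) for p, g in pairs]))

# Example on toy masks:
pred = np.zeros((4, 4), bool); pred[:2, :2] = True
gt = np.zeros((4, 4), bool); gt[:2, :] = True
print(mean_iou([(pred, gt)]))  # 0.5
```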
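On the released code: SAM-HQ is distributed as a fork of the original `segment_anything` package, so loading a model plausibly follows SAM's predictor API. The sketch below assumes that API carries over unchanged; the checkpoint filename, image path, and prompt coordinates are placeholders.

```python
# Hedged sketch, assuming SAM-HQ keeps the original SAM predictor API
# (sam_model_registry / SamPredictor); filenames are placeholders.
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_l"](checkpoint="sam_hq_vit_l.pth")
predictor = SamPredictor(sam.to("cuda"))

image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Single positive point prompt; prompt format follows the SAM API.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=False,
)
```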
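On HQSeg-44K: the training set is a union of six existing finely annotated datasets. A straightforward way to express such a composition in PyTorch is `ConcatDataset`; the `MaskFolder` loader and the directory layout below are hypothetical stand-ins, not the authors' data pipeline.

```python
# Sketch of composing a training set from several mask datasets, in the
# spirit of HQSeg-44K; MaskFolder and all paths are hypothetical.
from pathlib import Path
from torch.utils.data import Dataset, ConcatDataset

class MaskFolder(Dataset):
    """Hypothetical loader: paired image/mask files under one root."""
    def __init__(self, root):
        self.images = sorted(Path(root, "images").glob("*.jpg"))
        self.masks = sorted(Path(root, "masks").glob("*.png"))
    def __len__(self):
        return len(self.images)
    def __getitem__(self, i):
        return str(self.images[i]), str(self.masks[i])

train_set = ConcatDataset([
    MaskFolder("data/DIS5K_train"),
    MaskFolder("data/ThinObject5K_train"),
    MaskFolder("data/FSS-1000"),
    MaskFolder("data/ECSSD"),
    MaskFolder("data/MSRA10K"),
    MaskFolder("data/DUT-OMRON"),
])  # the paper reports 44,320 image-mask pairs in total
```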
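On the reported setup: the numbers are internally consistent, since 44,320 images at a total batch size of 32 give about 1,385 iterations per epoch, and 12 epochs give roughly 16.6K iterations, matching the paper. The sketch below mirrors that schedule in standard PyTorch; the optimizer type, the drop factor, and the toy model and data are assumptions, since the quoted text does not specify them.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins; the real run trains HQ-SAM's lightweight added parameters.
model = nn.Conv2d(3, 1, kernel_size=1)
data = TensorDataset(torch.randn(64, 3, 32, 32), torch.randn(64, 1, 32, 32))
loader = DataLoader(data, batch_size=32)  # total batch size 32 in the paper

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # optimizer type assumed
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[10], gamma=0.1  # one lr drop after epoch 10; factor assumed
)

for epoch in range(12):                    # 12 epochs, as reported
    for x, y in loader:
        loss = nn.functional.mse_loss(model(x), y)  # stand-in loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()                       # drop takes effect after epoch 10
```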