Segment Anything in High Quality
Authors: Lei Ke, Mingqiao Ye, Martin Danelljan, Yifan Liu, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate the effectiveness of HQ-SAM, we perform extensive quantitative and qualitative experimental analysis. We compare HQ-SAM with SAM on a suite of 10 diverse segmentation datasets across different downstream tasks, where 8 out of them are under a zero-shot transfer protocol, including COCO [31], UVO [42], SGinW [58], LVIS [14], HQ-YTVIS [20], BIG [6], COIFT [29] and HR-SOD [51]. |
| Researcher Affiliation | Academia | 1ETH Zürich 2HKUST 3Dartmouth College |
| Pseudocode | No | The paper describes the model architecture and process with text and diagrams, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code and pretrained models are at https://github.com/SysCV/SAM-HQ. |
| Open Datasets | Yes | To train HQ-SAM in a data-efficient manner, instead of further training on SA-1B [21], we compose a new training dataset HQSeg-44K which contains 44,320 extremely accurate image mask annotations. ... HQSeg-44K leverages a collection of six existing image datasets including DIS [35] (train set), ThinObject-5K [29] (train set), FSS-1000 [26], ECSSD [38], MSRA10K [8], DUT-OMRON [46] with extremely fine-grained mask labeling... |
| Dataset Splits | Yes | For ablation experiments, we use the four aforementioned extremely accurate segmentation datasets, namely, DIS (val) [35], ThinObject-5K (test) [29], COIFT [29] and HR-SOD [51] as well as the COCO validation set. |
| Hardware Specification | Yes | Thanks to the smaller-scale dataset and our minimal integrated architecture, HQ-SAM can be trained in only 4 hours on 8 RTX 3090 GPUs. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies such as programming languages, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | We use a learning rate of 0.001 and train our HQ-SAM for 12 epochs, with a learning rate drop after 10 epochs. We train on 8 Nvidia GeForce RTX 3090 GPUs with a total batch size of 32, which takes 4 hours to train for 16.6K iterations. |
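
The Open Source Code row points to a released repository. As a hedged illustration of how the released checkpoints might be loaded, the sketch below assumes HQ-SAM mirrors the original SAM predictor interface (`sam_model_registry`, `SamPredictor`); the import path and checkpoint filename are assumptions to verify against https://github.com/SysCV/SAM-HQ.

```python
import numpy as np

# Assumption: the HQ-SAM release mirrors the original SAM API; the import path
# and checkpoint filename below are placeholders to check against the repository.
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_l"](checkpoint="sam_hq_vit_l.pth")
predictor = SamPredictor(sam)

image = np.zeros((512, 512, 3), dtype=np.uint8)  # placeholder H x W x 3 RGB image
predictor.set_image(image)

# Single positive point prompt at the image center.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[256, 256]]),
    point_labels=np.array([1]),
    multimask_output=False,
)
```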
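
The Open Datasets row describes HQSeg-44K as a composition of six existing datasets. A minimal sketch of such a composition with `torch.utils.data.ConcatDataset`; the `FineMaskDataset` class and the directory names are hypothetical placeholders, not part of the released code.

```python
from torch.utils.data import ConcatDataset, Dataset

class FineMaskDataset(Dataset):
    """Hypothetical wrapper yielding (image, fine-grained mask) pairs from a folder."""

    def __init__(self, root):
        self.root = root
        self.samples = []  # would be filled by scanning `root` for image/mask pairs

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]

# The six source datasets quoted in the Open Datasets row (paths are hypothetical).
hqseg_44k = ConcatDataset([
    FineMaskDataset("DIS5K/train"),
    FineMaskDataset("ThinObject-5K/train"),
    FineMaskDataset("FSS-1000"),
    FineMaskDataset("ECSSD"),
    FineMaskDataset("MSRA10K"),
    FineMaskDataset("DUT-OMRON"),
])
```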
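
The Experiment Setup row quotes the learning rate, epoch count, and batch size. A minimal sketch of that schedule in PyTorch follows; the optimizer type (Adam) and the 10x decay factor are assumptions, since the quote only states that the learning rate is dropped after 10 epochs.

```python
import torch

# Hyperparameters quoted in the Experiment Setup row.
LEARNING_RATE = 1e-3
EPOCHS = 12
LR_DROP_EPOCH = 10      # learning rate is dropped after epoch 10
TOTAL_BATCH_SIZE = 32   # split across 8 GPUs (per-GPU split is an assumption)

def build_optimizer_and_scheduler(hq_sam_params):
    """`hq_sam_params`: the trainable HQ-SAM parameters."""
    # Assumption: Adam optimizer with a 10x step decay after LR_DROP_EPOCH epochs.
    optimizer = torch.optim.Adam(hq_sam_params, lr=LEARNING_RATE)
    scheduler = torch.optim.lr_scheduler.StepLR(
        optimizer, step_size=LR_DROP_EPOCH, gamma=0.1
    )
    return optimizer, scheduler
```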