Personalize Segment Anything Model with One Shot
Authors: Renrui Zhang, Zhengkai Jiang, Ziyu Guo, Shilin Yan, Junting Pan, Hao Dong, Yu Qiao, Peng Gao, Hongsheng Li
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate our efficacy, we construct a new dataset, PerSeg, for the evaluation of personalized object segmentation, and also test our methods on various one-shot image and video segmentation benchmarks. ... We first evaluate our approach for personalized segmentation on PerSeg in Section 3.1, along with various existing one-shot segmentation benchmarks in Section 3.2. Then, we illustrate the effectiveness of our PerSAM-assisted DreamBooth in Section 3.3. Finally, we conduct several ablation studies to investigate our designs on PerSeg in Section 3.4. |
| Researcher Affiliation | Collaboration | 1) CUHK MMLab; 2) Shanghai Artificial Intelligence Laboratory; 3) Institute of Automation, Chinese Academy of Sciences; 4) CFCS, School of CS, Peking University; 5) CPII of InnoHK |
| Pseudocode | No | No structured pseudocode or algorithm blocks found. |
| Open Source Code | Yes | Code is released at https://github.com/ZrrSkywalker/Personalize-SAM. |
| Open Datasets | Yes | To demonstrate our efficacy, we construct a new dataset, PerSeg, for the evaluation of personalized object segmentation... The raw images are collected from the training data of subject-driven diffusion works (Ruiz et al., 2022; Gal et al., 2022; Kumari et al., 2022). ... DAVIS 2017 (Pont-Tuset et al., 2017) ... FSS-1000 (Li et al., 2020), LVIS-92^i (Gupta et al., 2019), PASCAL-Part (Morabia et al., 2020), and PACO-Part (Ramanathan et al., 2023). |
| Dataset Splits | Yes | Video Object Segmentation. Given the first-frame image and object masks, our PerSAM and PerSAM-F achieve competitive object segmentation and tracking performance on the validation set of DAVIS 2017 (Pont-Tuset et al., 2017) ... For PerSAM-F, we conduct one-shot training for 1,000 epochs with a batch size 1, supervised by the dice loss (Milletari et al., 2016) and focal loss (Lin et al., 2017). |
| Hardware Specification | Yes | and only fine-tune 2 parameters within 10 seconds on a single A100 GPU. |
| Software Dependencies | No | The paper mentions software such as the AdamW optimizer, dice loss, focal loss, and OpenCV, but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For PerSAM-F, we conduct one-shot training for 1,000 epochs with a batch size 1, supervised by the dice loss (Milletari et al., 2016) and focal loss (Lin et al., 2017). We set the initial learning rate as 10^-3, and adopt an AdamW (Loshchilov & Hutter, 2017) optimizer with a cosine scheduler. ... The balance factor α in Equation 8 is set as 1. (A hedged training-loop sketch based on this setup follows the table.) |
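
The Experiment Setup row pins down a reproducible fine-tuning recipe for PerSAM-F: 1,000 one-shot epochs at batch size 1, dice plus focal supervision, AdamW at a 10^-3 learning rate with a cosine schedule, and only 2 trainable parameters. Below is a minimal, hypothetical PyTorch sketch of that recipe. The dummy mask logits (`cand_logits`), the ground-truth tensor (`gt_mask`), the two-scale weighting scheme, and the use of the balance factor to weight the dice term are all illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of the quoted PerSAM-F recipe: 1,000 one-shot epochs,
# batch size 1, dice + focal loss, AdamW (lr 1e-3) with a cosine scheduler.
import torch
import torch.nn.functional as F

def dice_loss(logits, target, eps=1e-6):
    """Soft dice loss (Milletari et al., 2016) on sigmoid probabilities."""
    prob = logits.sigmoid().flatten(1)
    target = target.flatten(1)
    inter = (prob * target).sum(-1)
    union = prob.sum(-1) + target.sum(-1)
    return (1.0 - (2.0 * inter + eps) / (union + eps)).mean()

def focal_loss(logits, target, gamma=2.0, alpha=0.25):
    """Binary focal loss (Lin et al., 2017) on raw logits."""
    ce = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    p = logits.sigmoid()
    p_t = p * target + (1 - p) * (1 - target)
    a_t = alpha * target + (1 - alpha) * (1 - target)
    return (a_t * (1.0 - p_t) ** gamma * ce).mean()

# The only trainable parameters: two scalars that re-weight candidate
# mask logits from a frozen SAM (the SAM forward pass itself is omitted).
scale_weights = torch.nn.Parameter(torch.ones(2))

# Dummy stand-ins for frozen SAM outputs and the one-shot ground-truth mask.
cand_logits = torch.randn(1, 2, 256, 256)             # two candidate mask scales
gt_mask = (torch.rand(1, 1, 256, 256) > 0.5).float()  # placeholder annotation

optimizer = torch.optim.AdamW([scale_weights], lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1000)
balance = 1.0  # assumed role of the balance factor alpha = 1 (Eq. 8 in the paper)

for epoch in range(1000):  # one-shot training, batch size 1
    weights = scale_weights.softmax(dim=0)
    logits = (cand_logits * weights.view(1, 2, 1, 1)).sum(dim=1, keepdim=True)
    loss = focal_loss(logits, gt_mask) + balance * dice_loss(logits, gt_mask)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```

Because only the two scalars are registered with the optimizer, the loop is consistent with the quoted claim that just 2 parameters are fine-tuned, and with the reported ~10-second fine-tuning time on a single A100.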