Personalize Segment Anything Model with One Shot

Authors: Renrui Zhang, Zhengkai Jiang, Ziyu Guo, Shilin Yan, Junting Pan, Hao Dong, Yu Qiao, Peng Gao, Hongsheng Li

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To demonstrate our efficacy, we construct a new dataset, PerSeg, for the evaluation of personalized object segmentation, and also test our methods on various one-shot image and video segmentation benchmarks. ... We first evaluate our approach for personalized segmentation on PerSeg in Section 3.1, along with various existing one-shot segmentation benchmarks in Section 3.2. Then, we illustrate the effectiveness of our PerSAM-assisted DreamBooth in Section 3.3. Finally, we conduct several ablation studies to investigate our designs on PerSeg in Section 3.4.
Researcher Affiliation | Collaboration | 1 CUHK MMLab; 2 Shanghai Artificial Intelligence Laboratory; 3 Institute of Automation, Chinese Academy of Sciences; 4 CFCS, School of CS, Peking University; 5 CPII of InnoHK
Pseudocode | No | No structured pseudocode or algorithm blocks found.
Open Source Code | Yes | Code is released at https://github.com/ZrrSkywalker/Personalize-SAM.
Open Datasets | Yes | To demonstrate our efficacy, we construct a new dataset, PerSeg, for the evaluation of personalized object segmentation... The raw images are collected from the training data of subject-driven diffusion works (Ruiz et al., 2022; Gal et al., 2022; Kumari et al., 2022). ... DAVIS 2017 (Pont-Tuset et al., 2017) ... FSS-1000 (Li et al., 2020), LVIS-92i (Gupta et al., 2019), PASCAL-Part (Morabia et al., 2020), and PACO-Part (Ramanathan et al., 2023).
Dataset Splits | Yes | Video Object Segmentation. Given the first-frame image and object masks, our PerSAM and PerSAM-F achieve competitive object segmentation and tracking performance on the validation set of DAVIS 2017 (Pont-Tuset et al., 2017) ... For PerSAM-F, we conduct one-shot training for 1,000 epochs with a batch size 1, supervised by the dice loss (Milletari et al., 2016) and focal loss (Lin et al., 2017).
Hardware Specification | Yes | ...and only fine-tune 2 parameters within 10 seconds on a single A100 GPU.
Software Dependencies | No | The paper mentions software such as the AdamW optimizer, dice loss, focal loss, and OpenCV, but does not provide version numbers for these or any other software dependencies.
Experiment Setup | Yes | For PerSAM-F, we conduct one-shot training for 1,000 epochs with a batch size 1, supervised by the dice loss (Milletari et al., 2016) and focal loss (Lin et al., 2017). We set the initial learning rate as 10^-3, and adopt an AdamW (Loshchilov & Hutter, 2017) optimizer with a cosine scheduler. ... The balance factor α in Equation 8 is set as 1.
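The training recipe reported above (dice plus focal supervision, AdamW with a cosine schedule starting at 10^-3) can be sketched in plain Python. The functions below are illustrative re-implementations of the standard dice loss, focal loss, and cosine learning-rate decay, not the authors' code; the focal-loss `gamma` and `alpha` defaults come from Lin et al. (2017), since the paper does not state the values it uses.

```python
import math

def dice_loss(pred, target, eps=1e-6):
    """Dice loss (Milletari et al., 2016): 1 - 2|P∩T| / (|P| + |T|).

    pred: predicted mask probabilities in [0, 1]; target: binary labels.
    """
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (total + eps)

def focal_loss(pred, target, gamma=2.0, alpha=0.25):
    """Focal loss (Lin et al., 2017), averaged over pixels.

    gamma/alpha are that paper's defaults; PerSAM's values are unspecified.
    """
    total = 0.0
    for p, t in zip(pred, target):
        p_t = p if t == 1 else 1.0 - p          # prob. of the true class
        a_t = alpha if t == 1 else 1.0 - alpha  # class-balancing weight
        total += -a_t * (1.0 - p_t) ** gamma * math.log(max(p_t, 1e-12))
    return total / len(pred)

def cosine_lr(step, total_steps, base_lr=1e-3):
    """Cosine-annealed learning rate from the reported initial 1e-3 to 0."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))
```

A perfect prediction drives both losses to zero, and `cosine_lr` decays smoothly from `base_lr` at step 0 to 0 at the final step, matching the one-shot, 1,000-epoch schedule described in the quote.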