Semi-Supervised Video Salient Object Detection Based on Uncertainty-Guided Pseudo Labels
Authors: Yongri Piao, Chenyang Lu, Miao Zhang, Huchuan Lu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that our method outperforms the existing semi-supervised method and some fully-supervised methods across five public benchmarks: DAVIS, FBMS, MCL, ViSal, and SegTrack-V2. |
| Researcher Affiliation | Academia | Yongri Piao (1), Chenyang Lu (1), Miao Zhang (1, corresponding author), Huchuan Lu (1,2). (1) Dalian University of Technology, China; (2) Pengcheng Lab, Shenzhen, China. Emails: yrpiao@dlut.edu.cn, luchenyang0724@mail.dlut.edu.cn, {miaozhang, lhchuan}@dlut.edu.cn |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and dataset are available at https://github.com/Lanezzz/UGPL. |
| Open Datasets | Yes | To evaluate the performance of our method, we conduct experiments on five widely-used VSOD datasets for fair comparisons. DAVIS [43] is the most popular VSOD dataset, with 50 high-quality fully annotated video sequences. FBMS [39] includes 59 natural video sequences... ViSal [51] is the first specially collected dataset for VSOD. MCL [26] includes 9 video sequences... SegTrack-V2 [27] is the earliest VOS dataset... |
| Dataset Splits | No | DAVIS: "The whole dataset is split into 30 sequences (2079 frames) for training and 20 sequences (1376 frames) for testing." FBMS: "The whole dataset is split into 29 sequences (353 frames) for training and 30 sequences (720 frames) for testing." (The paper specifies train/test splits but does not explicitly describe a separate validation split.) |
| Hardware Specification | Yes | Our network is implemented on the PyTorch framework with 4 GTX 1080Ti GPUs, and it is also adapted to the MindSpore framework of Huawei with an Ascend 910. |
| Software Dependencies | No | Our network is implemented on the PyTorch framework with 4 GTX 1080Ti GPUs, and it is also adapted to the MindSpore framework of Huawei with an Ascend 910. (No version numbers are provided for the software frameworks/libraries.) |
| Experiment Setup | Yes | For the training of UGPLG, the initial learning rate is set to 0.005 and decays by a factor of 0.1 every 25 epochs, with a batch size of 8. For the training of NS-GAN, the initial learning rate is set to 0.015 and decays by a factor of 0.1 every 20 epochs. Images are uniformly resized to 448×448. We adopt an SGD optimizer with momentum 0.9 and weight decay 5e-4. In the pre-train phase for NS-GAN, we pretrain the RGB branch on DUTS [49], a commonly used static-image SOD dataset; the initial learning rate is set to 0.01 and decays by a factor of 0.1 every 30 epochs. |
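The three learning-rate schedules quoted in the setup cell all follow the same step-decay pattern (decay by a factor of 0.1 every fixed number of epochs). A minimal sketch of that schedule, with a helper name of our own choosing rather than anything from the authors' code:

```python
def stepped_lr(initial_lr, step_size, gamma, epoch):
    """Learning rate under a step-decay schedule: initial_lr is
    multiplied by `gamma` once every `step_size` epochs.
    (Helper name and signature are illustrative, not from the paper.)"""
    return initial_lr * (gamma ** (epoch // step_size))

# Schedules described in the paper:
#   UGPLG:                    stepped_lr(0.005, 25, 0.1, epoch)
#   NS-GAN:                   stepped_lr(0.015, 20, 0.1, epoch)
#   NS-GAN RGB-branch on DUTS: stepped_lr(0.010, 30, 0.1, epoch)
```

In PyTorch this corresponds to `torch.optim.lr_scheduler.StepLR(optimizer, step_size=25, gamma=0.1)` on top of `torch.optim.SGD(..., lr=0.005, momentum=0.9, weight_decay=5e-4)`.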