Self-Supervised Pretraining for RGB-D Salient Object Detection
Authors: Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Xiang Ruan
AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on six benchmark datasets show that our self-supervised pretrained model performs favorably against most state-of-the-art methods pretrained on ImageNet. |
| Researcher Affiliation | Collaboration | Dalian University of Technology, China; Peng Cheng Laboratory, China; Tiwaki Co., Ltd., Japan |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code will be publicly available at https://github.com/Xiaoqi-Zhao-DLUT/SSLSOD. |
| Open Datasets | Yes | We evaluate the proposed model on six public RGB-D SOD datasets which are RGBD135 (Cheng et al. 2014), DUT-RGBD (Piao et al. 2019), STERE (Niu et al. 2012), NLPR (Peng et al. 2014), NJUD (Ju et al. 2014) and SIP (Fan et al. 2019). |
| Dataset Splits | No | The paper describes training and testing sets but does not explicitly provide validation dataset splits. |
| Hardware Specification | Yes | Our models are implemented in PyTorch and trained on an RTX 2080Ti GPU for 50 epochs with a mini-batch size of 4. |
| Software Dependencies | No | The paper mentions "Pytorch" but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | Our models are implemented in PyTorch and trained on an RTX 2080Ti GPU for 50 epochs with a mini-batch size of 4. We adopt several data augmentation techniques to avoid overfitting: random horizontal flipping, random rotation, and random brightness, saturation, and contrast changes. For the optimizer, we use stochastic gradient descent (SGD) with a momentum of 0.9 and a weight decay of 0.0005. For the pretext tasks, the learning rate is set to 0.001 and then adjusted with the poly policy (Liu, Rabinovich, and Berg 2015) with a power of 0.9. For the downstream task, the maximum learning rate is set to 0.005 for the backbone and 0.05 for the other parts; warm-up and linear decay strategies are used to adjust the learning rate. A training-configuration sketch is given below the table. |
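
The experiment-setup cell gives all of the optimizer and schedule hyperparameters, so a short PyTorch sketch shows one way to wire them up. This is not the authors' released code (see the repository linked above): the placeholder network, the iterations-per-epoch count, the warm-up length, the rotation range, and the color-jitter strengths are assumptions, while the SGD settings, base learning rates, poly power, and warm-up/linear-decay shape follow the quoted text.

```python
import torch
import torchvision.transforms as T
from torch.optim.lr_scheduler import LambdaLR

# --- Data augmentation: flip, rotation, brightness/saturation/contrast ---
# The exact rotation range and jitter strengths are not reported; values below are placeholders.
augment = T.Compose([
    T.RandomHorizontalFlip(),
    T.RandomRotation(degrees=15),
    T.ColorJitter(brightness=0.1, saturation=0.1, contrast=0.1),
])

model = torch.nn.Conv2d(3, 1, kernel_size=3, padding=1)  # stand-in for the real network
total_iters = 50 * 1000   # 50 epochs x an assumed number of iterations per epoch

# --- Pretext tasks: SGD (momentum 0.9, weight decay 5e-4), base LR 1e-3,
#     poly decay with power 0.9 ---
pretext_opt = torch.optim.SGD(model.parameters(), lr=1e-3,
                              momentum=0.9, weight_decay=5e-4)
poly_sched = LambdaLR(pretext_opt,
                      lr_lambda=lambda it: (1.0 - it / total_iters) ** 0.9)

# --- Downstream task: max LR 5e-3 for the backbone, 5e-2 for the other parts,
#     linear warm-up followed by linear decay. Warm-up length is assumed. ---
warmup_iters = 1000
downstream_opt = torch.optim.SGD(
    [
        {"params": model.parameters(), "lr": 5e-3},            # backbone group
        # {"params": other_parts.parameters(), "lr": 5e-2},    # decoder / other parts
    ],
    lr=5e-3, momentum=0.9, weight_decay=5e-4)

def warmup_then_linear_decay(it: int) -> float:
    if it < warmup_iters:
        return it / warmup_iters                                # linear warm-up
    return max(0.0, 1.0 - (it - warmup_iters) / (total_iters - warmup_iters))  # linear decay

warm_sched = LambdaLR(downstream_opt, lr_lambda=warmup_then_linear_decay)

# Usage inside a training loop (forward/backward pass omitted):
for it in range(3):
    pretext_opt.step()
    poly_sched.step()
```

`LambdaLR` is used here because both the poly policy and the warm-up/linear-decay policy can be expressed as a single multiplicative factor applied to each parameter group's base learning rate; the per-group base rates (0.005 vs. 0.05) then give the split between backbone and other parts described in the paper.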