Self-Supervised Pretraining for RGB-D Salient Object Detection

Authors: Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Xiang Ruan

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on six benchmark datasets show that our self-supervised pretrained model performs favorably against most state-of-the-art methods pretrained on ImageNet.
Researcher Affiliation | Collaboration | 1 Dalian University of Technology, China; 2 Peng Cheng Laboratory, China; 3 Tiwaki Co., Ltd., Japan
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code will be publicly available at https://github.com/Xiaoqi-Zhao-DLUT/SSLSOD.
Open Datasets | Yes | We evaluate the proposed model on six public RGB-D SOD datasets, which are RGBD135 (Cheng et al. 2014), DUT-RGBD (Piao et al. 2019), STERE (Niu et al. 2012), NLPR (Peng et al. 2014), NJUD (Ju et al. 2014) and SIP (Fan et al. 2019).
Dataset Splits | No | The paper describes training and testing sets but does not explicitly provide validation dataset splits.
Hardware Specification | Yes | Our models are implemented based on PyTorch and trained on an RTX 2080Ti GPU for 50 epochs with minibatch size 4.
Software Dependencies | No | The paper mentions "Pytorch" but does not specify its version or any other software dependencies with version numbers.
Experiment Setup | Yes | Our models are implemented based on PyTorch and trained on an RTX 2080Ti GPU for 50 epochs with minibatch size 4. We adopt several data augmentation techniques to avoid overfitting: random horizontal flipping, random rotation, and random changes to brightness, saturation, and contrast. For the optimizer, we use stochastic gradient descent (SGD) with a momentum of 0.9 and a weight decay of 0.0005. For the pretext tasks, the learning rate is set to 0.001 and adjusted with the poly policy (Liu, Rabinovich, and Berg 2015) with a power of 0.9. For the downstream task, the maximum learning rate is set to 0.005 for the backbone and 0.05 for the other parts. Warm-up and linear decay strategies are used to adjust the learning rate.
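The optimizer and learning-rate policies quoted in the Experiment Setup row map onto standard PyTorch components. Below is a minimal sketch of such a configuration, assuming a model object with a `backbone` submodule; the function names, iteration counts, and warm-up length are illustrative assumptions, not taken from the authors' released code.

```python
# Hedged sketch of the reported training configuration (SGD, momentum 0.9,
# weight decay 0.0005; poly LR with power 0.9 for pretext tasks; warm-up plus
# linear decay with per-group learning rates for the downstream task).
# `model`, `max_iter`, and `warmup_iters` are hypothetical placeholders.
import torch


def build_pretext_optimizer(model, base_lr=1e-3, max_iter=10000):
    """SGD with a poly learning-rate schedule: lr = base_lr * (1 - it/max_iter)^0.9."""
    optimizer = torch.optim.SGD(model.parameters(), lr=base_lr,
                                momentum=0.9, weight_decay=5e-4)
    poly = lambda it: max(0.0, 1.0 - it / max_iter) ** 0.9
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=poly)
    return optimizer, scheduler


def build_downstream_optimizer(model, warmup_iters=500, max_iter=10000):
    """Two parameter groups (backbone 0.005, other parts 0.05) with warm-up then linear decay."""
    backbone_params = list(model.backbone.parameters())
    backbone_ids = {id(p) for p in backbone_params}
    other_params = [p for p in model.parameters() if id(p) not in backbone_ids]
    optimizer = torch.optim.SGD(
        [{"params": backbone_params, "lr": 0.005},
         {"params": other_params, "lr": 0.05}],
        momentum=0.9, weight_decay=5e-4)

    def warmup_then_linear_decay(it):
        if it < warmup_iters:
            # Linear warm-up from 0 to the maximum learning rate.
            return it / max(1, warmup_iters)
        # Linear decay from the maximum learning rate down to zero.
        return max(0.0, 1.0 - (it - warmup_iters) / max(1, max_iter - warmup_iters))

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=warmup_then_linear_decay)
    return optimizer, scheduler
```

In a training loop, `scheduler.step()` would be called once per iteration so that the poly or warm-up/decay factor tracks the iteration count; the exact step granularity (per iteration vs. per epoch) is not specified in the quoted text and is an assumption here.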