RGB-D Salient Object Detection via 3D Convolutional Neural Networks

Authors: Qian Chen, Ze Liu, Yi Zhang, Keren Fu, Qijun Zhao, Hongwei Du

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on six widely used benchmark datasets demonstrate that RD3D performs favorably against 14 state-of-the-art RGB-D SOD approaches in terms of four key evaluation metrics. Our code will be made publicly available: https://github.com/PPOLYpubki/RD3D.
Researcher Affiliation | Academia | (1) School of Information Science and Technology, University of Science and Technology of China; (2) Institut National des Sciences Appliquées de Rennes; (3) College of Computer Science, Sichuan University; (4) National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University
Pseudocode | No | The paper describes the methodology using text, block diagrams (Figure 2), and mathematical formulations, but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code will be made publicly available: https://github.com/PPOLYpubki/RD3D.
Open Datasets | Yes | We evaluate our RD3D on six popular public datasets with paired RGB and depth images: NJU2K (1,985 pairs) (Ju et al. 2014), NLPR (1,000 pairs) (Peng et al. 2014), STERE (1,000 pairs) (Niu et al. 2012), DES (135 pairs, also called RGBD135 in some previous works) (Cheng et al. 2014), SIP (929 pairs) (Fan et al. 2020a), and DUTLF-D (1,200 pairs) (Piao et al. 2019).
Dataset Splits | Yes | Following (Chen and Li 2018; Chen, Li, and Su 2019; Han et al. 2017), we use the same 1,485 pairs from NJU2K and 700 pairs from NLPR for training; the remaining pairs are used for testing. Specifically, on the latest DUTLF-D dataset, we follow (Piao et al. 2019; Zhao et al. 2020; Piao et al. 2020; Li et al. 2020; Ji et al. 2020) and add an additional 800 pairs from DUTLF-D for training, testing on the remaining 400 pairs. (A sketch of this split assembly appears after the table.)
Hardware Specification | Yes | Our framework is implemented based on PyTorch (Paszke et al. 2019) on a workstation with 4 NVIDIA 1080Ti GPUs.
Software Dependencies | No | Our framework is implemented based on PyTorch (Paszke et al. 2019). The paper mentions PyTorch but does not specify a version number or other software dependencies with their versions.
Experiment Setup | Yes | During training, we adopt the Adam optimizer with an initial learning rate of 0.0001, decayed by a cosine learning rate scheduler. The weight decay is set to 0.001. The data is first resized to 352×352 and then augmented by random horizontal flipping and multi-scale transformation with scales of {256, 352, 416}. We train for 100 epochs on 4 GPUs with a batch size of 10 per GPU; the total training time is about 6 hours. The model after the last epoch is used for inference. For supervision, we compute the standard binary cross-entropy loss. During testing, an image of arbitrary size is first resized to 352×352 and the predicted saliency map is resized back to its original size. (A training-loop sketch based on this description appears after the table.)
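
To make the split composition in the Dataset Splits row concrete, here is a minimal Python sketch assembling the training and testing sets. The directory layout, file extensions, and prefix-slicing are illustrative assumptions: the community uses fixed, published index lists for these splits rather than a simple sorted order, so only the pair counts below come from the text.

```python
# Sketch of the RGB-D SOD train/test split described above.
# Directory layout and slicing order are assumptions for illustration.
from pathlib import Path

def list_pairs(root):
    """Pair each RGB image with its same-named depth map (assumed layout)."""
    rgb_dir, depth_dir = Path(root) / "RGB", Path(root) / "depth"
    return sorted((p, depth_dir / (p.stem + ".png"))
                  for p in rgb_dir.glob("*.jpg"))

nju2k = list_pairs("NJU2K")    # 1,985 pairs total
nlpr = list_pairs("NLPR")      # 1,000 pairs total
dutlfd = list_pairs("DUTLF-D") # 1,200 pairs total

# Training set: 1,485 NJU2K + 700 NLPR + 800 DUTLF-D = 2,985 pairs.
train = nju2k[:1485] + nlpr[:700] + dutlfd[:800]

# Testing: the remaining pairs of each split dataset; STERE, DES, and SIP
# are used in full for testing only.
test = {"NJU2K": nju2k[1485:], "NLPR": nlpr[700:], "DUTLF-D": dutlfd[800:]}
```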
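
The Experiment Setup row maps almost one-to-one onto standard PyTorch components. Below is a minimal training-loop sketch under that reading; `RD3D` and `train_loader` are hypothetical placeholders for the model and data pipeline in the authors' repository, and applying `binary_cross_entropy_with_logits` to raw logits is an assumption, since the paper only states that binary cross-entropy is used.

```python
# Minimal training-loop sketch of the quoted setup (not the authors' code).
import random

import torch
import torch.nn.functional as F
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR

EPOCHS, SCALES = 100, (256, 352, 416)

model = torch.nn.DataParallel(RD3D().cuda())  # hypothetical model; 4 GPUs,
optimizer = Adam(model.parameters(),          # batch size 10 per GPU
                 lr=1e-4, weight_decay=1e-3)
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS)

for epoch in range(EPOCHS):
    # train_loader (hypothetical) yields 352x352 tensors that have already
    # been randomly horizontally flipped.
    for rgb, depth, gt in train_loader:
        # Multi-scale transformation: rescale the whole batch to one of
        # {256, 352, 416} before the forward pass.
        s = random.choice(SCALES)
        rgb = F.interpolate(rgb.cuda(), size=(s, s), mode="bilinear",
                            align_corners=False)
        depth = F.interpolate(depth.cuda(), size=(s, s), mode="bilinear",
                              align_corners=False)
        gt = F.interpolate(gt.cuda(), size=(s, s), mode="nearest")

        pred = model(rgb, depth)              # saliency logits (assumption)
        loss = F.binary_cross_entropy_with_logits(pred, gt)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```

At test time the same 352×352 resize is applied to the input, and the predicted saliency map is resized back to the image's original resolution.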