Joint Semantic Mining for Weakly Supervised RGB-D Salient Object Detection

Authors: Jingjing Li, Wei Ji, Qi Bi, Cheng Yan, Miao Zhang, Yongri Piao, Huchuan Lu, Li Cheng

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive evaluations demonstrate the effectiveness of our approach under the weakly-supervised setting. Importantly, our method could also be adapted to work in both fully-supervised and unsupervised paradigms. In each of these scenarios, superior performance has been attained by our approach when comparing to the state-of-the-art dedicated methods. As a by-product, a CapS dataset is constructed by augmenting the existing benchmark training set with additional image tags and captions. Code and dataset are available at https://github.com/jiwei0921/JSM.
Researcher Affiliation | Academia | Jingjing Li (1), Wei Ji (1), Qi Bi (2), Cheng Yan (3), Miao Zhang (4), Yongri Piao (4), Huchuan Lu (4, 5), Li Cheng (1); 1 University of Alberta, Canada; 2 Wuhan University, China; 3 Tianjin University, China; 4 Dalian University of Technology, China; 5 Pengcheng Lab, Shenzhen, China
Pseudocode | No | The paper describes the methodology and processes with figures and textual explanations but does not include structured pseudocode or algorithm blocks with explicit labels such as 'Algorithm' or 'Pseudocode'.
Open Source Code | Yes | Code and dataset are available at https://github.com/jiwei0921/JSM.
Open Datasets | Yes | As a by-product, a CapS dataset is constructed by augmenting the existing benchmark training set with additional image tags and captions. Code and dataset are available at https://github.com/jiwei0921/JSM.
Dataset Splits | No | The paper states, 'In train vs. test splits of the datasets, the popular setup of [18, 19, 74] is followed for a fair comparison. Training set consists of 1,485 samples from NJUD and 700 samples from NLPR. The remaining images in these datasets and other public test sets are reserved for testing purposes throughout the experiments.' It clearly defines the training and testing sets (see the split sketch after the table), but does not explicitly mention a distinct validation split or its size/proportion.
Hardware Specification | Yes | The code is implemented in Pytorch toolbox on a PC with a single Tesla P40 GPU.
Software Dependencies | No | The paper mentions the 'Pytorch toolbox', the 'GloVe [54] word2vec network', and the 'NeuralTalk2 [30] toolkit', but does not provide version numbers for these software components or for any other libraries/solvers.
Experiment Setup | Yes | The model is optimized by Adam with a batch size of 10, and the learning rate is set to 1 × 10⁻⁴. During training, we use the standard BCE loss to train the saliency network. Each image is uniformly resized to 352 × 352, with random rotation and cropping applied to avoid potential overfitting. Our network is trained in an end-to-end manner and converges around 50 epochs. A minimal configuration sketch follows the table.
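
The train/test composition reported under Dataset Splits can be assembled along the following lines. This is a minimal sketch, not the authors' code: the directory layout ('datasets/NJUD/RGB', 'datasets/NLPR/RGB') and the ordering of samples are assumptions, since the paper follows the published split lists of [18, 19, 74] rather than an ad-hoc selection.

# Minimal sketch of the reported train/test composition (1,485 NJUD + 700 NLPR
# images for training, the rest reserved for testing). Paths and file naming are
# assumptions; the paper follows the fixed split lists of [18, 19, 74].
import os

def list_images(root):
    """Collect image file paths under a dataset root (assumed flat layout)."""
    return sorted(
        os.path.join(root, f)
        for f in os.listdir(root)
        if f.lower().endswith((".jpg", ".png"))
    )

njud = list_images("datasets/NJUD/RGB")  # hypothetical path
nlpr = list_images("datasets/NLPR/RGB")  # hypothetical path

# In practice the 1,485/700 training images come from published split files;
# slicing the sorted lists here is only illustrative.
train_set = njud[:1485] + nlpr[:700]
test_set = njud[1485:] + nlpr[700:]

print(f"train: {len(train_set)} images, test: {len(test_set)} images")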
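The Experiment Setup row translates into a PyTorch training configuration such as the sketch below. It is an assumption-laden illustration rather than the released JSM code: the SaliencyNet module, the dataloader, the rotation angle, and the crop padding are hypothetical, while the optimizer (Adam, learning rate 1e-4), batch size 10, BCE loss, 352 × 352 input size, and roughly 50 epochs come from the paper.

# Sketch of the reported training configuration; not the authors' released code.
# SaliencyNet and the dataset class are placeholders.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import transforms

# Resize to 352 x 352 with random rotation and cropping, as described in the paper.
# The rotation angle and crop padding are assumptions.
train_transform = transforms.Compose([
    transforms.Resize((352, 352)),
    transforms.RandomRotation(degrees=15),
    transforms.RandomCrop(352, padding=16),
    transforms.ToTensor(),
])

def train(model: nn.Module, loader: DataLoader, epochs: int = 50, device: str = "cuda"):
    """End-to-end training with Adam (lr = 1e-4) and the standard BCE loss."""
    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    criterion = nn.BCEWithLogitsLoss()  # assumes the saliency network outputs logits
    for _ in range(epochs):
        for rgb, depth, pseudo_label in loader:  # batch size 10 is set on the DataLoader
            rgb, depth = rgb.to(device), depth.to(device)
            pseudo_label = pseudo_label.to(device)
            pred = model(rgb, depth)             # placeholder two-stream forward pass
            loss = criterion(pred, pseudo_label)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

# Usage (placeholders): loader = DataLoader(train_dataset, batch_size=10, shuffle=True)
#                       train(SaliencyNet(), loader)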