Weakly Supervised Salient Object Detection Using Image Labels
Authors: Guanbin Li, Yuan Xie, Liang Lin
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that our proposed method greatly outperforms all state-of-the-art unsupervised saliency detection methods and can be comparable to the current best strongly-supervised methods trained with thousands of pixel-level saliency map annotations on all public benchmarks. |
| Researcher Affiliation | Collaboration | Guanbin Li¹, Yuan Xie¹, Liang Lin¹·² — ¹School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China; ²SenseTime Group Limited |
| Pseudocode | Yes | Algorithm 1 Saliency Annotations Updating. Require: current saliency map annotation Sanno, the predicted saliency map Spredict, CRF output of the current saliency map annotation Canno, CRF output of the predicted saliency map Cpredict, and CRF output of the class activation map Ccam. Ensure: the updated saliency map annotation Supdate. 1: if MAE(Canno, Cpredict) < α then 2: Supdate = CRF((Sanno + Spredict)/2) 3: else if MAE(Canno, Ccam) > β and MAE(Cpredict, Ccam) > β then 4: Discard the training sample in the next iteration 5: else if MAE(Canno, Ccam) < MAE(Cpredict, Ccam) then 6: Supdate = Canno 7: else 8: Supdate = Cpredict 9: end if |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | In the first stage, we train on Microsoft COCO object detection dataset for multi-label recognition, which comprises a training set of 82,783 images, and a validation set of 40,504 images. The dataset covers 80 common object categories, with about 3.5 object labels per image. In the second stage, we combine the training images of both the MSRA-B dataset (2500 images) (Liu et al. 2011) and the HKU-IS dataset (2500 images) (Li and Yu 2016b) as our training set (5000 images), with all original saliency annotations removed. |
| Dataset Splits | Yes | In the first stage, we train on Microsoft COCO object detection dataset for multi-label recognition, which comprises a training set of 82,783 images, and a validation set of 40,504 images. The dataset covers 80 common object categories, with about 3.5 object labels per image. In the second stage, we combine the training images of both the MSRA-B dataset (2500 images) (Liu et al. 2011) and the HKU-IS dataset (2500 images) (Li and Yu 2016b) as our training set (5000 images), with all original saliency annotations removed. The validation sets without annotations in the aforementioned two datasets are also combined as our validation set (1000 images). |
| Hardware Specification | Yes | A GTX Titan X GPU is used for both training and testing. |
| Software Dependencies | No | Our proposed Multi-FCN has been implemented on the public Deep Lab code base (Chen et al. 2014). No specific version numbers are provided for Deep Lab or other software dependencies. |
| Experiment Setup | Yes | During training, the mini-batch size is set to 2 and we choose to update the loss every 5 iterations. We set the momentum parameter to 0.9 and the weight decay to 0.0005 for both subtasks. The total number of iterations is set to 8K during each training round. During saliency annotation updating, the thresholds α and β are set to 15 and 40 respectively. The mean MAE of the training stop criteria is set to 0.05 in our experiment. |
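The annotation-updating rule in Algorithm 1 above can be sketched in Python. This is a minimal illustration, not the authors' implementation: it assumes saliency maps are NumPy arrays normalized to [0, 1] (so the paper's thresholds of 15 and 40, presumably on a 0–255 scale, are rescaled by 255), and `crf` is a hypothetical callable standing in for the dense-CRF refinement step.

```python
import numpy as np

ALPHA = 15 / 255.0  # threshold α (paper sets 15; rescaled assuming maps in [0, 1])
BETA = 40 / 255.0   # threshold β (paper sets 40; same rescaling assumption)

def mae(a, b):
    """Mean absolute error between two saliency maps."""
    return float(np.mean(np.abs(a - b)))

def update_annotation(s_anno, s_predict, c_anno, c_predict, c_cam, crf=None):
    """One step of Algorithm 1 for a single training image.

    Returns the updated saliency annotation, or None to discard the
    sample in the next iteration. `crf` is a placeholder for CRF
    refinement (identity if not supplied).
    """
    crf = crf or (lambda x: x)
    if mae(c_anno, c_predict) < ALPHA:
        # Prediction agrees with the current annotation: fuse and refine.
        return crf((s_anno + s_predict) / 2.0)
    if mae(c_anno, c_cam) > BETA and mae(c_predict, c_cam) > BETA:
        # Both maps disagree strongly with the CAM evidence: unreliable sample.
        return None
    # Otherwise keep whichever CRF-refined map is closer to the CAM.
    if mae(c_anno, c_cam) < mae(c_predict, c_cam):
        return c_anno
    return c_predict
```

In practice `crf` would be a fully connected CRF over the input image (the paper refines all maps this way before comparison), and the discard case simply removes that image from the next training round.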