Weakly-Supervised Salient Object Detection Using Point Supervision

Authors: Shuyong Gao, Wei Zhang, Yan Wang, Qianyu Guo, Chenglong Zhang, Yangji He, Wenqiang Zhang

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments on the five largest benchmark datasets demonstrate that our method outperforms the previous state-of-the-art methods trained with stronger supervision and even surpasses several fully supervised state-of-the-art models.
Researcher Affiliation | Academia | (1) Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University; (2) Academy for Engineering & Technology, Fudan University; {sygao18,weizh,yanwang19,qyguo20,clzhang20,yjhe20,wqzhang}@fudan.edu.cn
Pseudocode | Yes | Algorithm 1: Flood Filling Algorithm (an executable Python sketch follows the table)
  Input: seed point (x, y), image I, set value α, old value old = I(x, y)
  Output: filled mask M
  1: flood_filling((x, y), I, α)
  2:   if x ≥ 0 and x < width and y ≥ 0 and y < height
  3:      and a < I(x, y) − old < b and I(x, y) ≠ α then
  4:     M(x, y) ← α
  5:     flood_filling((x + 1, y), I, α)
  6:     flood_filling((x − 1, y), I, α)
  7:     flood_filling((x, y + 1), I, α)
  8:     flood_filling((x, y − 1), I, α)
  9:   end if
Open Source Code | Yes | The code is available at: https://github.com/shuyonggao/PSOD.
Open Datasets | Yes | To minimize the labeling time consumption while providing location information of salient objects, we build a Point-supervised Dataset (P-DUTS) by relabeling the DUTS (Wang et al. 2017) dataset, a widely used saliency detection dataset containing 10,553 training images. ... To evaluate the performance, we experiment on five widely used benchmark datasets: ECSSD (Yan et al. 2013), PASCAL-S (Li et al. 2014), DUT-O (Yang et al. 2013), HKU-IS (Li and Yu 2015), and DUTS-test.
Dataset Splits | No | The paper states that the P-DUTS dataset is used as the training set and that evaluation is performed on separate benchmark test datasets. However, it does not explicitly provide train/validation/test splits (e.g., percentages or counts) or mention a dedicated validation set for hyperparameter tuning.
Hardware Specification | Yes | We train on four TITAN Xp GPUs.
Software Dependencies | No | The paper states the model is "implemented on the Pytorch toolbox" but does not provide a specific version number for PyTorch or any other software dependency.
Experiment Setup | Yes | The maximum learning rate is set to 2.5 × 10^-4 for the transformer part and 2.5 × 10^-3 for the other parts. Warm-up and linear decay strategies are used to adjust the learning rate. Stochastic gradient descent (SGD) is used to train the network with momentum = 0.9 and weight decay = 5 × 10^-4. Horizontal flipping and random cropping are used for data augmentation. The batch size is set to 28, and the first training procedure takes 20 epochs. The hyperparameter γ of Eq. 1 is set to 5. The second round of training uses the same parameters, but the masks are replaced with refined ones. (A PyTorch sketch of this optimizer and schedule follows the table.)
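For concreteness, here is a minimal runnable Python/NumPy sketch of the flood-filling step quoted in the Pseudocode row above. It replaces the recursion with an explicit stack to sidestep Python's recursion limit; the function name, the open-interval reading a < I(x, y) − old < b of the intensity condition, and the uint8 mask are assumptions for illustration, not the authors' released implementation.

```python
import numpy as np
from collections import deque

def flood_filling(seed, image, alpha, a, b):
    """Sketch of Algorithm 1: fill 4-connected pixels whose intensity
    difference from the seed's original value lies in (a, b).

    `seed` is (x, y); `alpha` is the set value (assumed nonzero so it is
    distinguishable from the unfilled mask); `a` and `b` bound the allowed
    deviation from the seed intensity (a must be below 0 for the seed
    itself to pass the test).
    """
    height, width = image.shape
    mask = np.zeros((height, width), dtype=np.uint8)
    x0, y0 = seed
    old = float(image[y0, x0])                 # "old value I(x, y)" at the seed
    stack = deque([(x0, y0)])
    while stack:
        x, y = stack.pop()
        if not (0 <= x < width and 0 <= y < height):
            continue                           # outside the image
        if mask[y, x] == alpha:
            continue                           # already filled, don't revisit
        if not (a < float(image[y, x]) - old < b):
            continue                           # intensity outside the band
        mask[y, x] = alpha                     # M(x, y) <- alpha
        # the four recursive calls become four pushed neighbours
        stack.extend([(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)])
    return mask

# Hypothetical usage: grow a region of similar intensity around a point label.
# gray = np.asarray(Image.open("img.png").convert("L"))
# mask = flood_filling((120, 80), gray, alpha=255, a=-15.0, b=15.0)
```

With a point annotation as the seed, this grows a region of similar intensity that can serve as an initial estimate of the salient object's extent.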
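The Experiment Setup row likewise maps almost directly onto PyTorch. Below is a minimal sketch: the learning rates, momentum, weight decay, batch size, and epoch count come from the row above, while the transformer/head module split, the 500-step warm-up length, and the use of LambdaLR are assumptions for illustration.

```python
import torch
from torch import nn

# Stand-in modules: `transformer` and `head` are hypothetical names for the
# "transformer part" vs. "other parts" split described in the setup.
model = nn.ModuleDict({
    "transformer": nn.Linear(64, 64),
    "head": nn.Conv2d(64, 1, kernel_size=1),
})

# Two parameter groups with the two maximum learning rates from the paper.
optimizer = torch.optim.SGD(
    [
        {"params": model["transformer"].parameters(), "lr": 2.5e-4},
        {"params": model["head"].parameters(), "lr": 2.5e-3},
    ],
    momentum=0.9,
    weight_decay=5e-4,
)

steps_per_epoch = 10553 // 28        # P-DUTS images / batch size 28
total_steps = 20 * steps_per_epoch   # 20 epochs, first training round
warmup_steps = 500                   # warm-up length: our assumption

def warmup_linear(step):
    """Multiplier on each group's maximum LR: linear warm-up,
    then linear decay to zero over the remaining steps."""
    if step < warmup_steps:
        return (step + 1) / warmup_steps
    return max(0.0, (total_steps - step) / (total_steps - warmup_steps))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, warmup_linear)

# Each training iteration: loss.backward(); optimizer.step(); scheduler.step().
```

Because both groups declare their own `lr`, the scheduler's multiplier scales each group relative to its own maximum, reproducing the two-rate schedule described above.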