Pyramidal Feature Shrinking for Salient Object Detection
Authors: Mingcan Ma, Changqun Xia, Jia Li. Pages 2311-2318
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive quantitative and qualitative experiments demonstrate that the proposed intuitive framework outperforms 14 state-of-the-art approaches on 5 public datasets. |
| Researcher Affiliation | Collaboration | Mingcan Ma¹,², Changqun Xia², Jia Li¹,²; ¹State Key Laboratory of Virtual Reality Technology and Systems, SCSE, Beihang University, Beijing, China; ²Pengcheng Laboratory, Shenzhen, China; {mingcanma, jiali}@buaa.edu.cn, xiachq@pcl.ac.cn |
| Pseudocode | No | The paper provides mathematical expressions (e.g., equations 1-7) but no structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for their methodology. |
| Open Datasets | Yes | Like many previous methods, we choose DUTS-TR, the training set of DUTS (Wang et al. 2017), to train our network, which contains 10,553 images and corresponding annotated maps. |
| Dataset Splits | No | The paper mentions DUTS-TR for training and several datasets for testing, but does not explicitly specify a validation dataset or a train/validation/test split. |
| Hardware Specification | Yes | We use PyTorch to construct our network and train it on a PC with a GTX1080TI GPU. |
| Software Dependencies | No | The paper mentions using 'PyTorch' but does not specify a version number or other software dependencies with their versions. |
| Experiment Setup | Yes | Set batch size to 20, epochs to 50, use Stochastic Gradient Descent (SGD), set the momentum to 0.9, and weight decay to 0.0005. Horizontal flipping, random cropping, and multi-scale input images are used to pre-process the images. ResNet-50 pre-trained on ImageNet was used as the backbone. We set the maximum learning rate of the backbone to 0.005 and the other parts to 0.05. And the learning rate first increases and then decreases with the training process. The size of each image is adjusted to 352 × 352 to predict the saliency map without any post-processing. |
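
The experiment setup in the table can be collected into a small configuration sketch. The hyperparameter values below come from the table. Everything else is an assumption made for illustration: the names `TrainConfig` and `triangular_lr` are hypothetical, the paper only says the learning rate "first increases and then decreases", so the symmetric linear shape with its peak at the midpoint of training is a guess, and the step count assumes the last partial batch of each epoch is kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters reported in the table (names are hypothetical)."""
    batch_size: int = 20
    epochs: int = 50
    momentum: float = 0.9
    weight_decay: float = 0.0005
    max_lr_backbone: float = 0.005  # ResNet-50 backbone
    max_lr_other: float = 0.05      # all remaining layers
    input_size: int = 352           # images are resized to 352 x 352
    train_images: int = 10553       # size of DUTS-TR


def triangular_lr(step: int, total_steps: int, max_lr: float) -> float:
    """ASSUMED schedule: linear rise to max_lr at the midpoint, then linear fall.

    The source does not give the exact shape or the position of the peak.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = step / total_steps  # fraction of training completed, in [0, 1]
    return max_lr * (1.0 - abs(2.0 * progress - 1.0))


cfg = TrainConfig()
# Ceiling division: keeps the final partial batch (an assumption).
steps_per_epoch = -(-cfg.train_images // cfg.batch_size)
total_steps = steps_per_epoch * cfg.epochs

if __name__ == "__main__":
    for step in (0, total_steps // 4, total_steps // 2, total_steps):
        lr_b = triangular_lr(step, total_steps, cfg.max_lr_backbone)
        lr_o = triangular_lr(step, total_steps, cfg.max_lr_other)
        print(f"step {step:>6}: backbone lr={lr_b:.5f}, other lr={lr_o:.5f}")
```

Under these assumptions, both parameter groups follow the same schedule shape and differ only in their peak value, so the backbone rate stays at one tenth of the rate used for the other parts throughout training.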