Patch Proposal Network for Fast Semantic Segmentation of High-Resolution Images
Authors: Tong Wu, Zhenzhen Lei, Bingqian Lin, Cuihua Li, Yanyun Qu, Yuan Xie. Pages 12402-12409
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results show that our method achieves almost the best segmentation performance compared with the state-of-the-art segmentation methods and the inference speed is 12.9 fps on DeepGlobe and 10 fps on ISIC. |
| Researcher Affiliation | Academia | Tong Wu¹, Zhenzhen Lei¹, Bingqian Lin³, Cuihua Li¹, Yanyun Qu¹, Yuan Xie². ¹Fujian Key Laboratory of Sensing and Computing for Smart City, School of Informatics, Xiamen University, Fujian, China; ²School of Computer Science and Technology, East China Normal University, Shanghai, China; ³School of Biomedical Engineering, Sun Yat-sen University, Guangzhou, China |
| Pseudocode | No | The paper describes algorithms and training steps in text and flowcharts (e.g., Figures 2 and 3), but it does not provide a distinct pseudocode block or a section explicitly labeled 'Algorithm'. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that open-source code for the described methodology is provided. |
| Open Datasets | Yes | DeepGlobe (Demir et al. 2018) is a high-quality satellite dataset focusing on rural areas... ISIC (Tschandl, Rosendahl, and Kittler 2018; Codella et al. 2018) is an ultra-resolution medical dataset... CRAG (Graham et al. 2019; Awan et al. 2017) is an HRI dataset... Cityscapes (Cordts et al. 2016) is a street scene dataset... |
| Dataset Splits | Yes | DeepGlobe (Demir et al. 2018) is a high-quality satellite dataset focusing on rural areas, which provides 803 images in 7 classes with 2448 × 2448 pixels. We randomly divide the dataset into training, validation and testing sets with 455, 206 and 142 images, respectively. ... ISIC (Tschandl, Rosendahl, and Kittler 2018; Codella et al. 2018) is an ultra-resolution medical dataset for pigmented skin lesions, whose training set contains 2077 images, validation set contains 260 images and testing set contains 259 images. ... The CRAG dataset is split into a training set and a testing set, which contain 173 and 40 images. ... Cityscapes (Cordts et al. 2016) ... It contains 3475 finely annotated images with the size of 2048 × 1024; 2975 images are used for training and the rest are used for validation. |
| Hardware Specification | Yes | We train our model with 10 batches on a single 1080Ti GPU in a PyTorch framework (Ketkar 2017). |
| Software Dependencies | No | The paper mentions 'PyTorch framework (Ketkar 2017)' and 'Adam (Kingma and Ba 2014)' but does not provide specific version numbers for PyTorch or other critical libraries. |
| Experiment Setup | Yes | In our model, an image with the size of 512 × 512 is fed into G-branch and then it is uniformly partitioned into 4 × 4 blocks to generate 16 patches. We use Adam (Kingma and Ba 2014) with initial learning rate as 1 × 10⁻⁴ to optimize G-branch, R-branch and PPN. The weight decay coefficient and momentum are set to 5 × 10⁻⁴ and 0.9, respectively. The parameter γ in Focal loss functions of G-branch and R-branch is set to 3. The epochs of pre-training G-branch and maximum number of alternate training are set to 10 and 120, respectively. We train our model with 10 batches on a single 1080Ti GPU in a PyTorch framework (Ketkar 2017). We use the terminal tool gpustat to measure the GPU memory usage. For the fairness of comparison with the state-of-the-art methods, we set the batch size to 1 during inference. |
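
Since the paper releases neither code nor pseudocode, the following is only a minimal sketch of two settings quoted in the Experiment Setup row: the uniform 4 × 4 partition of a 512 × 512 input into 16 patches, and a focal loss with γ = 3. The function names `grid_patches` and `focal_loss` are hypothetical, and the loss uses the standard per-pixel form −(1 − p)^γ · log(p) without class weighting, which is an assumption; the paper's exact formulation may differ.

```python
import math


def grid_patches(height, width, rows, cols):
    """Return (top, left, bottom, right) boxes of a uniform rows x cols grid.

    Hypothetical helper: assumes the image divides evenly into the grid.
    """
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the grid size")
    ph, pw = height // rows, width // cols
    return [
        (r * ph, c * pw, (r + 1) * ph, (c + 1) * pw)
        for r in range(rows)
        for c in range(cols)
    ]


def focal_loss(p_true, gamma=3.0):
    """Standard focal loss for one pixel (assumed form, no class weighting).

    p_true is the predicted probability of the pixel's true class.
    """
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must lie in (0, 1]")
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


# A 512 x 512 input split into a 4 x 4 grid gives 16 patches of 128 x 128.
boxes = grid_patches(512, 512, 4, 4)
print(len(boxes), boxes[0], boxes[-1])

# With gamma = 3, a well-classified pixel contributes far less than a hard one.
print(focal_loss(0.9), focal_loss(0.1))
```

The first `print` shows 16 boxes, running from `(0, 0, 128, 128)` to `(384, 384, 512, 512)`. How the patch proposal network then selects which of these patches to refine is not covered by this sketch.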