PatchDCT: Patch Refinement for High Quality Instance Segmentation
Authors: Qinrou Wen, Jirui Yang, Xue Yang, Kewei Liang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our method achieves 2.0%, 3.2%, 4.5% AP and 3.4%, 5.3%, 7.0% Boundary AP improvements over Mask-RCNN on COCO, LVIS, and Cityscapes, respectively. It also surpasses DCT-Mask by 0.7%, 1.1%, 1.3% AP and 0.9%, 1.7%, 4.2% Boundary AP on COCO, LVIS, and Cityscapes. Besides, the performance of PatchDCT is also competitive with other state-of-the-art methods. |
| Researcher Affiliation | Collaboration | Qinrou Wen¹, Jirui Yang², Xue Yang³, Kewei Liang¹; ¹School of Mathematical Sciences, Zhejiang University; ²Alibaba Group; ³Department of CSE, MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University |
| Pseudocode | No | The paper describes its method and pipeline in text and uses Figure 2 to illustrate the pipeline, but it does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | PyTorch Code: https://github.com/olivia-w12/PatchDCT |
| Open Datasets | Yes | We evaluate our method on two standard instance segmentation datasets: COCO (Lin et al., 2014) and Cityscapes (Cordts et al., 2016). Following (Kirillov et al., 2020), we also report AP⋆ and AP⋆_B, which evaluate COCO val2017 with high-quality annotations provided by LVIS (Gupta et al., 2019). |
| Dataset Splits | Yes | Cityscapes is a dataset focused on urban street scenes. It contains 8 categories for instance segmentation, providing 2,975, 500, and 1,525 high-resolution images (1,024 × 2,048) for training, validation, and test, respectively. |
| Hardware Specification | Yes | Runtime is measured on a single A100. ... about 1.5 FPS degradation on the A100 GPU. ... Mask-Transfiner runs at 5.5 FPS on the A100 GPU |
| Software Dependencies | No | The paper mentions building the model based on DCT-Mask and implementing the algorithm based on Detectron2. It also notes 'PyTorch Code' in the abstract, but it does not specify version numbers for these software components. |
| Experiment Setup | Yes | We set the patch size to 8 and each patch is represented by a 6-dimensional DCT vector. Our model is class-specific by default, i.e., one mask per class. L1 loss and cross-entropy loss are used for DCT vector regression and patch classification, respectively. By default, only one PatchDCT module is used, and both λ0 and λ1 are set to 1. We implement our algorithm based on Detectron2 (Wu et al., 2019), and all hyperparameters remain the same as Mask-RCNN in Detectron2. Unless otherwise stated, 1× learning schedule is used. |
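
To make the "patch size 8, 6-dimensional DCT vector" setting in the Experiment Setup row concrete, the following is a minimal NumPy sketch of how a mask could be split into 8 × 8 patches and each patch compressed to six low-frequency DCT coefficients. It is an illustration under stated assumptions, not the authors' implementation: the function names (`dct_matrix`, `zigzag_indices`, `encode_patches`, `decode_patches`), the orthonormal DCT-II normalization, the zigzag coefficient ordering, and the 16 × 16 toy mask are all hypothetical choices made here.

```python
import numpy as np

PATCH = 8    # patch size, as stated in the Experiment Setup row
N_COEF = 6   # DCT vector length per patch, as stated in the Experiment Setup row


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (assumed normalization)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def zigzag_indices(n, count):
    """First `count` (row, col) positions in zigzag, low-frequency-first order.

    The zigzag ordering is an assumption made for this sketch.
    """
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return order[:count]


def encode_patches(mask):
    """Map an (H, W) mask to an (H/8, W/8, 6) array of per-patch DCT vectors."""
    h, w = mask.shape
    d = dct_matrix(PATCH)
    rows, cols = zip(*zigzag_indices(PATCH, N_COEF))
    # Rearrange into a grid of 8 x 8 patches: (grid_h, grid_w, 8, 8).
    patches = mask.reshape(h // PATCH, PATCH, w // PATCH, PATCH).transpose(0, 2, 1, 3)
    coef = d @ patches @ d.T  # 2D DCT of every patch via broadcasting
    return coef[..., list(rows), list(cols)]


def decode_patches(vectors):
    """Invert encode_patches, treating all discarded coefficients as zero."""
    gh, gw, _ = vectors.shape
    d = dct_matrix(PATCH)
    rows, cols = zip(*zigzag_indices(PATCH, N_COEF))
    coef = np.zeros((gh, gw, PATCH, PATCH))
    coef[..., list(rows), list(cols)] = vectors
    patches = d.T @ coef @ d  # inverse 2D DCT of every patch
    return patches.transpose(0, 2, 1, 3).reshape(gh * PATCH, gw * PATCH)


# Toy usage: a 16 x 16 all-foreground mask yields a 2 x 2 grid of 6-dim vectors.
mask = np.ones((16, 16))
vectors = encode_patches(mask)
restored = decode_patches(vectors)
```

For a real binary mask, the decoded array would be thresholded (for example at 0.5) to recover a mask; patches containing a boundary are only approximated, since just six of the 64 coefficients are kept.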