Exploiting Semantic Relations for Glass Surface Detection
Authors: Jiaying Lin, Yuen-Hei Yeung, Rynson Lau
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our model outperforms state-of-the-art works, especially with 42.6% MAE improvement on our proposed GSD-S dataset. |
| Researcher Affiliation | Academia | Jiaying Lin, Yuen-Hei Yeung, Rynson W. H. Lau. Department of Computer Science, City University of Hong Kong. {jiayinlin5-c, yh.y}@my.cityu.edu.hk, Rynson.Lau@cityu.edu.hk |
| Pseudocode | No | The paper describes the model architecture and modules verbally and with diagrams, but it does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code, dataset, and models are available at https://jiaying.link/neurips2022-gsds/ |
| Open Datasets | Yes | In addition, we propose a large-scale glass surface detection dataset named Glass Surface Detection - Semantics (GSD-S), which contains 4,519 real-world RGB glass surface images from diverse real-world scenes with detailed annotations for both glass surface detection and semantic segmentation. Code, dataset, and models are available at https://jiaying.link/neurips2022-gsds/ |
| Dataset Splits | No | The paper states training and testing splits: 'We processed 4,519 images, with 3,911 training images and 608 testing images altogether.' It does not mention a validation split or its size for the GSD-S dataset in the main text. |
| Hardware Specification | Yes | Kaiming uniform initialization [45] was used, and the model was trained on an NVIDIA RTX 2080Ti GPU. |
| Software Dependencies | No | The paper mentions PyTorch's DeepLabV3-ResNet50 model but does not specify version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The input data is first uniformly resized to 384×384 before applying normalization. A joint loss, which is a combination of binary cross-entropy and the Lovász-Softmax loss [46], was used to supervise the intermediate feature maps (i.e., layers 2 and 4) and the final output. Prediction evaluation is accompanied by a fully connected Conditional Random Field (CRF) [47] for binarization refinement. Specifically, with the output features produced by the SAA and CCA modules, the decoder, which adopts the Feature Pyramid Network structure, was configured in accordance with the original paper: Internal Channel Number = {128, 256, 512, 1024}, Linear Layer Dimension = 512, Pooling Scales = {1, 2, 3, 6}. |
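The joint loss described in the setup row (binary cross-entropy combined with the Lovász loss of Berman et al. [46]) can be sketched as follows. The paper does not publish its loss code here, so this is a minimal pure-Python illustration using the binary (Lovász hinge) variant over flattened pixel logits; the loss weights `w_bce` and `w_lovasz` and the mean reduction are assumptions, not values from the paper.

```python
import math

def lovasz_grad(gt_sorted):
    """Gradient of the Lovász extension of the Jaccard loss,
    given ground-truth labels sorted by descending error."""
    gts = sum(gt_sorted)
    jaccard = []
    cum_pos, cum_neg = 0, 0
    for g in gt_sorted:
        cum_pos += g
        cum_neg += 1 - g
        intersection = gts - cum_pos
        union = gts + cum_neg
        jaccard.append(1.0 - intersection / union if union > 0 else 0.0)
    # turn cumulative Jaccard values into per-element gradients
    for i in range(len(jaccard) - 1, 0, -1):
        jaccard[i] -= jaccard[i - 1]
    return jaccard

def lovasz_hinge(logits, labels):
    """Binary Lovász hinge loss over flattened pixel logits (labels in {0,1})."""
    signs = [2 * l - 1 for l in labels]               # {0,1} -> {-1,+1}
    errors = [1.0 - lg * s for lg, s in zip(logits, signs)]
    order = sorted(range(len(errors)), key=lambda i: -errors[i])
    grad = lovasz_grad([labels[i] for i in order])
    return sum(max(errors[i], 0.0) * g for i, g in zip(order, grad))

def bce_with_logits(logits, labels):
    """Numerically stable mean binary cross-entropy on raw logits."""
    total = 0.0
    for x, z in zip(logits, labels):
        total += max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
    return total / len(logits)

def joint_loss(logits, labels, w_bce=1.0, w_lovasz=1.0):
    """Weighted BCE + Lovász hinge; the weights are illustrative assumptions."""
    return w_bce * bce_with_logits(logits, labels) + \
           w_lovasz * lovasz_hinge(logits, labels)
```

In the paper this supervision is applied not only to the final output but also to the intermediate feature maps of layers 2 and 4, i.e., the same joint loss would be evaluated at each supervised head and summed.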