RFENet: Towards Reciprocal Feature Evolution for Glass Segmentation

Authors: Ke Fan, Changan Wang, Yabiao Wang, Chengjie Wang, Ran Yi, Lizhuang Ma

IJCAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that our RFENet achieves state-of-the-art performance on three popular public datasets.
Researcher Affiliation | Collaboration | Shanghai Jiao Tong University; Tencent Youtu Lab
Pseudocode | No | None found. The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/VankouF/RFENet.
Open Datasets | Yes | Glass datasets: (1) Trans10k [Xie et al., 2020] is a large-scale transparent object segmentation dataset consisting of 10,428 images across three categories: things, stuff and background. Images are divided into 5,000, 1,000 and 4,428 images for training, validation and test, respectively. (2) GSD [Lin et al., 2021] is a medium-scale glass segmentation dataset containing 4,098 glass images, covering a diversity of indoor and outdoor scenes. All the data are randomly split into a training set with 3,285 images and a test set with 813 images. Mirror dataset: PMD [Lin et al., 2020] is a large-scale mirror dataset containing 5,096 training images and 571 test images.
Dataset Splits | Yes | Trans10k... Images are divided into 5,000, 1,000 and 4,428 images for training, validation and test, respectively.
Hardware Specification | No | We launch the training process on 4 GPUs with synchronized batch normalization, unless otherwise mentioned.
Software Dependencies | No | We implement RFENet using the PyTorch framework [Paszke et al., 2019].
Experiment Setup | Yes | For the Trans10k dataset, input images are resized to 512×512 for both training and testing. The initial learning rate is set to 0.04, and weight decay is set to 0.0001. We use a mini-batch size of 4 for each GPU and run for 60 epochs. For the GSD dataset, following the same setting as [Lin et al., 2021], the input images are first resized to 400×400 and then randomly cropped to 384×384. Random flipping is used for training. During inference, the test images are also first resized to 384×384 before being fed into the network. The initial learning rate is set to 0.01, and weight decay is set to 0.0005. We run for 80 epochs with a batch size of 6 for each GPU. For the PMD dataset, we adopt the same setting as PMDNet [Lin et al., 2020], where the input images are resized to 384×384. The initial learning rate is set to 0.03. The other settings remain the same as those on the GSD dataset. (Distributed-launch and configuration sketches follow the table.)
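
The Hardware Specification and Software Dependencies rows only state that training runs on 4 GPUs with synchronized batch normalization under PyTorch. The sketch below shows one plausible way such a launch could look; the stand-in network, the torchrun entry point, and the SyncBatchNorm conversion are assumptions about a typical PyTorch multi-GPU setup, not the authors' released training script (which lives in the linked repository).

```python
# Hypothetical 4-GPU launch with synchronized batch normalization (PyTorch).
# This is a sketch, not the authors' training code.
# Launch: torchrun --nproc_per_node=4 train_sketch.py
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun per process
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(local_rank)

    # Stand-in network with BatchNorm layers; the real RFENet is in the
    # authors' repository (https://github.com/VankouF/RFENet).
    model = nn.Sequential(
        nn.Conv2d(3, 64, 3, padding=1),
        nn.BatchNorm2d(64),
        nn.ReLU(inplace=True),
        nn.Conv2d(64, 1, 1),
    )
    # Replace every BatchNorm layer with SyncBatchNorm so batch statistics
    # are aggregated across all 4 GPUs, matching "synchronized batch
    # normalization" in the Hardware Specification row.
    model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
    model = DDP(model.cuda(local_rank), device_ids=[local_rank])

    # ... training loop with a DistributedSampler-backed DataLoader goes here ...

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

With only 4-6 images per GPU (see the Experiment Setup row), synchronizing batch statistics across devices is a common way to keep normalization stable; the paper does not spell out this motivation, so treat it as a general remark rather than a claim from the authors.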
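
The Experiment Setup row quotes concrete preprocessing and optimization hyper-parameters but not their implementation. The sketch below is one possible torchvision reading of the GSD pipeline (resize to 400×400, random crop to 384×384, random flip) together with the reported per-dataset learning rates and weight decays; the transform composition, the SGD-with-momentum choice, and the build_optimizer helper are illustrative assumptions, not the released code, and in practice the spatial transforms must be applied jointly to each image and its mask.

```python
# Illustrative preprocessing and optimizer settings taken from the
# Experiment Setup row; the implementation details are assumptions.
import torch
from torchvision import transforms

# GSD training: resize to 400x400, random crop to 384x384, random flip.
# (Image-side only; segmentation masks need the same spatial transforms.)
gsd_train_transform = transforms.Compose([
    transforms.Resize((400, 400)),
    transforms.RandomCrop((384, 384)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# GSD inference: resize test images to 384x384 before feeding the network.
gsd_test_transform = transforms.Compose([
    transforms.Resize((384, 384)),
    transforms.ToTensor(),
])


def build_optimizer(model, dataset="gsd"):
    """Hypothetical helper wrapping the reported per-dataset hyper-parameters:
    Trans10k lr 0.04 / wd 1e-4, GSD lr 0.01 / wd 5e-4, PMD lr 0.03 / wd 5e-4."""
    cfg = {
        "trans10k": dict(lr=0.04, weight_decay=1e-4),  # 512x512 inputs, 60 epochs
        "gsd": dict(lr=0.01, weight_decay=5e-4),       # 80 epochs, batch 6 per GPU
        "pmd": dict(lr=0.03, weight_decay=5e-4),       # other settings as GSD
    }[dataset]
    # The optimizer type is not specified in the quoted text; SGD with
    # momentum is assumed here purely for illustration.
    return torch.optim.SGD(model.parameters(), momentum=0.9, **cfg)
```

Usage would be, for example, optimizer = build_optimizer(model, dataset="gsd"); any learning-rate schedule is left out because the quoted setup does not mention one.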