Segmenting Transparent Objects in the Wild with Transformer

Authors: Enze Xie, Wenjia Wang, Wenhai Wang, Peize Sun, Hang Xu, Ding Liang, Ping Luo

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We benchmark more than 20 recent semantic segmentation methods, demonstrating that Trans2Seg significantly outperforms all the CNN-based methods, showing the proposed algorithm's potential ability to solve transparent object segmentation. Code is available in github.com/xieenze/Trans2Seg. [...] 5 Experiments
Researcher Affiliation | Collaboration | 1 The University of Hong Kong, 2 SenseTime Research, 3 Nanjing University, 4 Huawei Noah's Ark Lab
Pseudocode | No | The paper states 'The pseudo code of small conv head is shown in Figure 4,' but Figure 4 is a diagram illustrating the network architecture and data flow, not a pseudocode block; no actual pseudocode is presented.
Open Source Code | Yes | Code is available in github.com/xieenze/Trans2Seg.
Open Datasets | Yes | This work presents a new fine-grained transparent object segmentation dataset, termed Trans10K-v2, extending Trans10K-v1, the first large-scale transparent object segmentation dataset. [...] Our Trans10K-v2 dataset is based on the Trans10K dataset [Xie et al., 2020].
Dataset Splits | Yes | Following Trans10K, we use 5000, 1000 and 4428 images for training, validation and testing, respectively.
Hardware Specification | Yes | We use 8 V100 GPUs for all experiments.
Software Dependencies | No | The paper mentions 'We implement Trans2Seg with Pytorch' but does not specify the PyTorch version or the versions of any other software dependencies.
Experiment Setup | Yes | For loss optimization, we use the Adam optimizer with epsilon 1e-8 and weight decay 1e-4. Batch size is 8 per GPU. We set the learning rate to 1e-4, decayed by the poly strategy [Yu et al., 2018], for 50 epochs. [...] For our Trans2Seg, we adopt a Transformer architecture and need to keep the shape of the learned position embedding the same in training/inference, so we directly resize the image to 512x512.
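
The quoted experiment setup maps onto a short PyTorch training-loop configuration. The sketch below is illustrative only, not the authors' released code (which lives at github.com/xieenze/Trans2Seg): the model, dummy dataset, and class count are placeholders, and the poly power of 0.9 is an assumed common default that the paper does not state.

```python
# Minimal sketch of the reported setup: Adam (eps=1e-8, weight decay 1e-4),
# base lr 1e-4 with poly decay over 50 epochs, batch size 8 per GPU, 512x512 inputs.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

EPOCHS = 50
BASE_LR = 1e-4
BATCH_PER_GPU = 8
POLY_POWER = 0.9       # assumption: typical poly-schedule power, not given in the paper
NUM_CLASSES = 12       # placeholder class count, not the exact Trans10K-v2 label set

# Placeholder segmentation network; the real model is the Trans2Seg hybrid CNN+Transformer.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, NUM_CLASSES, 1))

optimizer = torch.optim.Adam(model.parameters(), lr=BASE_LR,
                             eps=1e-8, weight_decay=1e-4)

# Dummy data at the fixed 512x512 resolution required by the learned position embedding.
images = torch.randn(16, 3, 512, 512)
masks = torch.randint(0, NUM_CLASSES, (16, 512, 512))
loader = DataLoader(TensorDataset(images, masks), batch_size=BATCH_PER_GPU, shuffle=True)

criterion = nn.CrossEntropyLoss()
total_iters = EPOCHS * len(loader)

step = 0
for epoch in range(EPOCHS):
    for x, y in loader:
        # Poly learning-rate decay: lr = base_lr * (1 - step / total_iters) ** power
        lr = BASE_LR * (1 - step / total_iters) ** POLY_POWER
        for group in optimizer.param_groups:
            group["lr"] = lr
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        step += 1
```

In multi-GPU training the effective batch size would be 8 per GPU across the 8 V100s reported above; the single-process loop here is only meant to show the optimizer, schedule, and input-resolution choices.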