Multi-Modality Affinity Inference for Weakly Supervised 3D Semantic Segmentation

Authors: Xiawei Li, Qingyuan Xu, Jing Zhang, Tianyi Zhang, Qian Yu, Lu Sheng, Dong Xu

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on the ScanNet and S3DIS benchmarks verify the effectiveness of our proposed method, which outperforms the state-of-the-art by 4% to 6% mIoU. Codes are released at https://github.com/Sunny599/AAAI24-3DWSSG-MMA.
Researcher Affiliation | Academia | (1) School of Software, Beihang University; (2) College of Computer Science and Technology, Zhejiang University; (3) Department of Computer Science, The University of Hong Kong. {ZY2121108,ZY2121121,zhang jing,qianyu,lsheng}@buaa.edu.cn, tianyizhang0213@zju.edu.cn, dongxu@hku.hk
Pseudocode | No | The paper describes the proposed method in the 'Methodology' section using prose and a pipeline diagram (Figure 1), but it does not include any explicit pseudocode blocks or algorithm listings.
Open Source Code | Yes | Codes are released at https://github.com/Sunny599/AAAI24-3DWSSG-MMA.
Open Datasets | Yes | We evaluate the proposed approach MMA on two benchmarks, the ScanNet (Dai et al. 2017) and S3DIS (Armeni et al. 2017) datasets. ScanNet is a commonly-used indoor 3D point cloud dataset for semantic segmentation. It contains 1513 training scenes (1201 scenes for training, 312 scenes for validation) and 100 test scenes, annotated with 20 classes. S3DIS is also an indoor 3D point cloud dataset, which contains 6 indoor areas and has 13 classes.
Dataset Splits | Yes | ScanNet is a commonly-used indoor 3D point cloud dataset for semantic segmentation. It contains 1513 training scenes (1201 scenes for training, 312 scenes for validation) and 100 test scenes, annotated with 20 classes. Following previous work, we use Area 5 as the test data. (A split-configuration sketch follows the table.)
Hardware Specification | Yes | The model is trained on a 3090 GPU with a batch size of 8 for 300 epochs.
Software Dependencies | No | The paper mentions using the "AdamW optimizer" and "PointNet++" as the backbone, but it does not specify version numbers for these or other software libraries/frameworks (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | The RGB-appended input point clouds are a set of 10-dimensional vectors, including coordinates (x, y, z), color (R, G, B), surface normal, and height, while the pure geometric input is produced by masking out the RGB values with 0. The model is trained on a 3090 GPU with a batch size of 8 for 300 epochs. We use the AdamW optimizer with an initial learning rate of 0.0014, decayed by half at 160 epochs and 180 epochs. All hyper-parameters are tuned based on the validation set. (A training-setup sketch follows the table.)
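
For reference, the dataset splits quoted in the table can be summarized as a small configuration sketch. The scene counts, class counts, and the Area-5 test split come from the quoted paper text; the dictionary layout, key names, and the assumption that the remaining S3DIS areas are used for training are illustrative, not taken from the released code.

```python
# Split summary reconstructed from the quoted paper text (layout is illustrative).
SCANNET_SPLITS = {
    "train_scenes": 1201,   # training scenes
    "val_scenes": 312,      # validation scenes (1201 + 312 = 1513 "training scenes")
    "test_scenes": 100,     # held-out test scenes
    "num_classes": 20,
}

S3DIS_SPLITS = {
    "train_areas": [1, 2, 3, 4, 6],  # remaining indoor areas (assumed, standard protocol)
    "test_areas": [5],               # "we use Area 5 as the test data"
    "num_classes": 13,
}
```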
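The experiment-setup row can likewise be read as a short PyTorch sketch. The batch size, epoch count, initial learning rate, halving milestones, and the 10-dimensional input with RGB masking are quoted from the paper; the MultiStepLR scheduler, the channel ordering, the point count per scene, and the placeholder backbone are assumptions made only for illustration.

```python
# Minimal sketch of the quoted training setup; not the authors' implementation.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import MultiStepLR

model = torch.nn.Linear(10, 20)  # stand-in for the PointNet++ backbone

# 10-D per-point features: xyz (3) + RGB (3) + surface normal (3) + height (1)
points_rgb = torch.randn(8, 4096, 10)   # RGB-appended input, batch size 8 (4096 points assumed)
points_geo = points_rgb.clone()
points_geo[..., 3:6] = 0.0              # pure geometric input: RGB channels masked with 0

optimizer = AdamW(model.parameters(), lr=0.0014)
scheduler = MultiStepLR(optimizer, milestones=[160, 180], gamma=0.5)  # halve LR at epochs 160 and 180

for epoch in range(300):
    # ... one pass over the training data (forward, loss, backward) would go here ...
    optimizer.step()
    scheduler.step()
```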