Automated Label Unification for Multi-Dataset Semantic Segmentation with GNNs

Authors: Rong Ma, Jie Chen, Xiangyang Xue, Jian Pu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The results demonstrate that our method significantly outperforms other multi-dataset training methods when trained on seven datasets simultaneously, and achieves state-of-the-art performance on the WildDash 2 benchmark.
Researcher Affiliation | Academia | Rong Ma, Jie Chen, Xiangyang Xue, and Jian Pu. Fudan University. rma22@m.fudan.edu.cn, {chenj19,xyxue,jianpu}@fudan.edu.cn
Pseudocode | Yes | Algorithm 1: The training pipeline of our model
Open Source Code | Yes | Our code can be found at https://github.com/Mrhonor/AutoUniSeg.
Open Datasets | Yes | Our training datasets cover a wide range of scenarios, from indoor scenes to driving scenes. We also introduce corresponding test datasets, which are not used in the training process, for the respective scenes to evaluate our generalization capability. Datasets mentioned: Cityscapes [13], Mapillary [33], BDD [46], IDD [42], SUN RGB-D [37], ADE20K [51], COCO [25].
Dataset Splits | No | The paper lists training and validation datasets but does not specify the explicit split percentages or sample counts used for validation.
Hardware Specification | Yes | We train our model for 300k iterations on four 80GB A100 GPUs.
Software Dependencies | No | The paper mentions the Llama-2-7B model and the AdamW optimizer but does not provide specific version numbers for software dependencies or libraries.
Experiment Setup | Yes | Our segmentation model is based on the HRNet-W48 architecture, while the GNN model is a three-layer GraphSAGE. We utilize the Llama-2-7B model to encode label descriptions into 4096-dimensional text features. We evenly sample 3 images per dataset within a batch for each GPU. For all images, we first apply random resizing with a ratio ranging from 0.5 to 2, followed by a random crop operation to achieve a final image size of 768×768 pixels. We use the AdamW optimizer with warmup and polynomial learning rate decay, starting with a learning rate of 0.0001. We train our model for 300k iterations. (Configuration sketches based on this setup follow the table.)
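
For readers who want a concrete picture of the GNN component named above, here is a minimal PyTorch Geometric sketch of a three-layer GraphSAGE operating on 4096-dimensional label-text features. Only the depth (three layers) and the input width (4096, from the Llama-2-7B encodings) come from the paper; the class name, hidden widths, activation choice, and toy graph are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv  # requires torch_geometric

class LabelUnificationGNN(torch.nn.Module):
    """Hypothetical three-layer GraphSAGE over label-description features.

    in_dim=4096 matches the Llama-2-7B text features reported in the paper;
    hidden_dim and out_dim are illustrative assumptions.
    """
    def __init__(self, in_dim=4096, hidden_dim=512, out_dim=256):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden_dim)
        self.conv2 = SAGEConv(hidden_dim, hidden_dim)
        self.conv3 = SAGEConv(hidden_dim, out_dim)

    def forward(self, x, edge_index):
        # x: [num_labels, 4096] label-description embeddings
        # edge_index: [2, num_edges] graph over dataset-specific labels
        x = F.relu(self.conv1(x, edge_index))
        x = F.relu(self.conv2(x, edge_index))
        return self.conv3(x, edge_index)  # embeddings in a shared label space

# Toy usage: 10 labels with a few undirected edges.
x = torch.randn(10, 4096)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 0, 3, 2]])
out = LabelUnificationGNN()(x, edge_index)  # -> shape [10, 256]
```

In the paper's setting, the nodes would correspond to dataset-specific labels whose GNN embeddings drive the automated unification across datasets; how the label graph's edges are constructed is not shown here.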
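
Likewise, the reported optimization recipe (AdamW, initial learning rate 1e-4, warmup followed by polynomial decay, 300k iterations) could be wired up as in the sketch below. The warmup length, decay exponent, and the stand-in model are assumptions, since the quoted setup does not state them.

```python
import torch

# Stand-in module; the paper trains an HRNet-W48 segmentation model,
# which is not reproduced here.
model = torch.nn.Conv2d(3, 19, kernel_size=1)

# From the setup above: AdamW with an initial learning rate of 1e-4.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

max_iters = 300_000    # from the paper
warmup_iters = 1_500   # assumption: warmup length is not stated
power = 0.9            # assumption: a common polynomial-decay exponent

def poly_warmup(it: int) -> float:
    """Learning-rate multiplier: linear warmup, then polynomial decay to 0."""
    if it < warmup_iters:
        return it / warmup_iters
    progress = (it - warmup_iters) / (max_iters - warmup_iters)
    return (1.0 - progress) ** power

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, poly_warmup)

# In the 300k-iteration training loop, step once per optimizer update:
#   optimizer.step(); optimizer.zero_grad(); scheduler.step()
```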