Modeling Multimodal Aleatoric Uncertainty in Segmentation with Mixture of Stochastic Experts

Authors: Zhitong Gao, Yucong Chen, Chuyu Zhang, Xuming He

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our method on the LIDC-IDRI dataset and a modified multimodal Cityscapes dataset. Results demonstrate that our method achieves state-of-the-art or competitive performance on all metrics.
Researcher Affiliation | Academia | ¹ShanghaiTech University, Shanghai, China; ²Lingang Laboratory, Shanghai, China; ³Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai, China. {gaozht,chenyc,zhangchy2,hexm}@shanghaitech.edu.cn
Pseudocode | Yes | The overall training procedure is shown in Algorithm 1. We use the weighted version of our nonparametric representation during training, as displayed in lines 3-9. [...] Algorithm 1 Training Procedure (see the training-step sketch after this table)
Open Source Code | Yes | The complete source code and trained models are publicly released at https://github.com/gaozhitong/MoSE-AUSeg.
Open Datasets | Yes | We validate our method on the LIDC-IDRI dataset (Armato III et al., 2011) and a modified multimodal Cityscapes dataset (Cordts et al., 2016; Kohl et al., 2018).
Dataset Splits | Yes | For a fair comparison, we use a preprocessed 2D dataset provided by Kohl et al. (2018) with 15096 slices, each cropped to 128×128 patches, and adopt the same 60-20-20 dataset split as Baumgartner et al. (2019) and Monteiro et al. (2020). (see the split sketch after this table)
Hardware Specification | Yes | We use two NVIDIA TITAN RTX GPUs on the LIDC dataset and four NVIDIA TITAN RTX GPUs on the Cityscapes dataset.
Software Dependencies | No | The paper mentions a 'PyTorch implementation' and the 'Adam optimizer' but does not specify version numbers for PyTorch or any other software libraries, compilers, or operating systems used.
Experiment Setup | Yes | On the LIDC dataset, we use K = 4 experts, each with S = 4 samples. In the loss function, we use the IoU (Milletari et al., 2016) as the pair-wise cost function and set the hyperparameters γ0 = 1/2, β = 1 for the full-annotation case and β = 10 for the one-annotation case. On the Cityscapes dataset, we use a slightly larger number of experts, K = 35, each with S = 2 samples. We use the CE as the pair-wise cost function and set the hyperparameters γ0 = 1/32, β = 1 for the loss. To stabilize the training, we adopt a gradient-smoothing trick in some cases and refer the reader to Appendix A.1 for details. (see the configuration sketch after this table)
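The Pseudocode row quotes Algorithm 1 but the report does not reproduce it. Purely for orientation, the following is a minimal PyTorch sketch of one training step for a mixture-of-stochastic-experts segmentation model. All module names (backbone, gate, experts) are hypothetical, and the best-annotation-match loss is a simplified stand-in for the paper's weighted nonparametric matching objective with a pair-wise cost (IoU or CE, per the Experiment Setup row); it is not the authors' Algorithm 1.

```python
# Minimal, illustrative sketch of one MoSE-style training step.
# `backbone`, `gate`, and `experts` are hypothetical modules: `gate` is
# assumed to pool features and return (B, K) logits, and each expert is a
# stochastic head that yields one segmentation sample per forward call.
import torch
import torch.nn.functional as F

def mose_training_step(backbone, gate, experts, images, annotations, S=4):
    """images: (B, C, H, W); annotations: (B, M, H, W) integer label maps."""
    B, M = annotations.shape[0], annotations.shape[1]

    feats = backbone(images)                          # (B, D, h, w) features
    pi = F.softmax(gate(feats), dim=-1)               # (B, K) mixture weights

    loss = images.new_zeros(())
    for k, expert in enumerate(experts):
        for _ in range(S):                            # S samples per expert
            logits = expert(feats)                    # (B, n_cls, H, W) sample
            # Pixel-wise CE against each of the M annotations, keeping the
            # best-matching annotation per image (a simplification of the
            # paper's matching with a pair-wise cost).
            costs = torch.stack(
                [F.cross_entropy(logits, annotations[:, m],
                                 reduction='none').mean(dim=(1, 2))
                 for m in range(M)], dim=-1)          # (B, M)
            best = costs.min(dim=-1).values           # (B,)
            # Each sample carries weight pi_k / S in the mixture.
            loss = loss + (pi[:, k] * best / S).sum()
    return loss / B
```

The actual procedure additionally uses the weighted nonparametric representation and the β and γ0 hyperparameters from the Experiment Setup row; see Algorithm 1 in the paper and the released code linked above.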
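The Dataset Splits row reports a 60-20-20 split of the 15096 preprocessed slices. As an illustration of the proportions only, a naive index-level sketch follows; the actual split is defined by the cited works, and a patient-level split may be required to avoid leakage between sets, which this slice-level shuffle ignores.

```python
# Illustrative 60-20-20 index split over the 15096 slices. The real split
# follows Baumgartner et al. (2019) / Monteiro et al. (2020); this is only
# a sketch of the proportions.
import random

def split_indices(n=15096, seed=0):
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    a, b = int(0.6 * n), int(0.8 * n)
    return idx[:a], idx[a:b], idx[b:]   # train (60%), val (20%), test (20%)

train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))  # 9057 3019 3020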
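Finally, the Experiment Setup row lists per-dataset hyperparameters. A hypothetical configuration sketch collecting them is below; the field names are illustrative and do not come from the authors' released code.

```python
# Hypothetical config objects for the hyperparameters quoted in the
# Experiment Setup row; names are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class MoSEConfig:
    num_experts: int         # K
    samples_per_expert: int  # S
    pairwise_cost: str       # 'iou' or 'ce'
    gamma0: float            # γ0
    beta: float              # β

LIDC_FULL = MoSEConfig(4, 4, 'iou', gamma0=1 / 2, beta=1.0)   # full annotations
LIDC_ONE = MoSEConfig(4, 4, 'iou', gamma0=1 / 2, beta=10.0)   # one annotation
CITYSCAPES = MoSEConfig(35, 2, 'ce', gamma0=1 / 32, beta=1.0)
```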