Importance-Aware Semantic Segmentation in Self-Driving with Discrete Wasserstein Training

Authors: Xiaofeng Liu, Yuzhuo Han, Song Bai, Yi Ge, Tianxing Wang, Xu Han, Site Li, Jane You, Jun Lu

AAAI 2020, pp. 11629-11636 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method on CamVid and Cityscapes datasets with different backbones (SegNet, ENet, FCN and Deeplab) in a plug-and-play fashion. In our extensive experiments, Wasserstein loss demonstrates superior segmentation performance on the predefined critical classes for safe driving.
Researcher Affiliation | Academia | (1) Beth Israel Deaconess Medical Center, Harvard Medical School, Harvard University; (2) School of Mathematical Sciences, Dalian University of Technology; (3) Department of Statistics, University of California, Berkeley; (4) Carnegie Mellon University; (5) Fudan University; (6) Johns Hopkins University; (7) Department of Computing, The Hong Kong Polytechnic University.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper neither states unambiguously that the authors are releasing their code for this work nor provides a direct link to a source-code repository.
Open Datasets | Yes | We evaluate our method on CamVid and Cityscapes datasets with different backbones (SegNet, ENet, FCN and Deeplab) in a plug-and-play fashion. To illustrate the effectiveness of each setting choice and their combinations, we give a series of elaborate ablation studies along with the standard measures. All of the networks are pre-trained with CE loss as their vanilla version. The intersection-over-union (IoU) is defined as IoU = TP / (TP + FP + FN) (Eq. 6), where TP, FP, and FN denote the numbers of true positive, false positive, and false negative pixels, respectively. Moreover, the mean IoU is the average of IoU among all classes.
Dataset Splits | Yes | For training/validation/testing, the recent Cityscapes dataset contains 2975/500/1525 images respectively. The CamVid dataset contains 367/26/233 images for training/validation/testing respectively.
Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments, such as GPU models, CPU models, or cloud instance types with their specifications.
Software Dependencies | No | The paper mentions various models and frameworks (SegNet, ENet, FCN, Deeplab), but it does not provide version numbers for any software components or libraries required to replicate the experiments.
Experiment Setup | No | The paper mentions using specific backbones (SegNet, ENet, FCN, Deeplab), states that networks are pre-trained with CE loss, and says "We use the same setting and measurements as IAL", but it does not provide concrete hyperparameters such as learning rate, batch size, or optimizer settings for its own training process.
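The rows above quote the paper's core technical claims; the sketches below illustrate them in code. First, the Research Type row: the method replaces cross-entropy with a discrete Wasserstein loss whose ground cost encodes per-class importance for safe driving. For a one-hot label, the exact optimal-transport cost from the softmax output to the label distribution reduces to the expected ground cost of the predicted mass, which admits a simple closed form. The sketch below is ours, not the authors' implementation; the importance weights, the max-based ground cost, and all function names are assumptions.

```python
import torch
import torch.nn.functional as F

def importance_ground_cost(importance):
    """Ground-cost matrix M where confusing class i with class j is
    penalised by the larger of the two importance weights. This 'max'
    combination is an assumption; the paper's ground metric may differ."""
    w = torch.as_tensor(importance, dtype=torch.float32)
    M = torch.max(w[:, None], w[None, :])
    M.fill_diagonal_(0.0)  # correct predictions incur no transport cost
    return M

def wasserstein_seg_loss(logits, target, M):
    """Discrete Wasserstein loss for dense prediction.

    With a one-hot target, the optimal plan moves all predicted mass to
    the target class, so the per-pixel loss is sum_j p_j * M[j, y].
    logits: (B, C, H, W) raw scores; target: (B, H, W) int64 labels.
    """
    p = F.softmax(logits, dim=1)         # per-pixel class distribution
    cost = M[:, target]                  # (C, B, H, W): M[j, y] per pixel
    cost = cost.permute(1, 0, 2, 3)      # align with p as (B, C, H, W)
    return (p * cost).sum(dim=1).mean()  # mean expected transport cost
```

Unlike per-pixel cross-entropy, the penalty here depends on which wrong class receives mass, so confusing a pedestrian with road can be made far more expensive than confusing sidewalk with road.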
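Second, the Open Datasets row quotes the evaluation measure, Eq. (6). A minimal NumPy sketch of per-class IoU and mean IoU (function names are ours):

```python
import numpy as np

def iou_per_class(pred, target, num_classes):
    """Per-class IoU = TP / (TP + FP + FN), following Eq. (6).
    pred, target: integer label maps of the same shape."""
    ious = np.full(num_classes, np.nan)  # NaN marks classes absent from both maps
    for c in range(num_classes):
        tp = np.sum((pred == c) & (target == c))
        fp = np.sum((pred == c) & (target != c))
        fn = np.sum((pred != c) & (target == c))
        denom = tp + fp + fn
        if denom > 0:
            ious[c] = tp / denom
    return ious

def mean_iou(pred, target, num_classes):
    """Mean IoU: the average of per-class IoU over all classes."""
    return float(np.nanmean(iou_per_class(pred, target, num_classes)))
```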
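For reference, the split counts from the Dataset Splits row, collected into a config-style dictionary:

```python
# Images per split as reported in the paper (train / val / test).
DATASET_SPLITS = {
    "Cityscapes": {"train": 2975, "val": 500, "test": 1525},
    "CamVid":     {"train": 367,  "val": 26,  "test": 233},
}
```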
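Finally, the Experiment Setup row: every backbone is first trained with CE loss as its vanilla version, and the new loss is then dropped in plug-and-play fashion; concrete hyperparameters are deferred to the IAL settings. A sketch of that protocol, where the optimizer choice and learning rate are placeholders of ours, not values from the paper:

```python
import torch
import torch.nn as nn

def finetune_plug_and_play(model, loader, criterion, epochs=1, lr=1e-4):
    """Fine-tune a CE-pretrained segmentation backbone (SegNet, ENet,
    FCN, Deeplab, ...) with a drop-in replacement pixel-wise loss.
    Adam and lr are placeholder choices, not taken from the paper."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            logits = model(images)            # (B, C, H, W)
            loss = criterion(logits, labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

# Vanilla baseline first, then swap in the Wasserstein sketch from above:
# finetune_plug_and_play(model, loader, nn.CrossEntropyLoss())
# finetune_plug_and_play(model, loader,
#                        lambda z, y: wasserstein_seg_loss(z, y, M))
```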