Weakly Supervised Scene Parsing with Point-Based Distance Metric Learning

Authors: Rui Qian, Yunchao Wei, Honghui Shi, Jiachen Li, Jiaying Liu, Thomas Huang

AAAI 2019, pp. 8843-8850 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on two challenging scene parsing benchmarks of PASCAL-Context and ADE20K to validate the effectiveness of our PDML, and competitive mIoU scores are achieved.
Researcher Affiliation | Collaboration | Rui Qian (1,3), Yunchao Wei (1), Honghui Shi (2,1), Jiachen Li (1), Jiaying Liu (3), Thomas Huang (1); affiliations: (1) IFP Group, Beckman Institute, UIUC; (2) IBM Research; (3) Institute of Computer Science and Technology, Peking University
Pseudocode | Yes | Algorithm 1: Optimizing Procedure for PDML
Open Source Code | No | Our code will be available publicly.
Open Datasets | Yes | Our proposed model is trained and evaluated on two challenging scene parsing datasets: PASCAL-Context (Mottaghi et al. 2014) and ADE20K (Zhou et al. 2017)
Dataset Splits | Yes | We evaluate different methods quantitatively by using pixel accuracy and mIoU, which describe the precision of prediction and the average performance among all classes, respectively. Table 2 (statistics of the two datasets): PASCAL-Context has 4998 training and 5105 evaluation images; ADE20K has 20210 training and 2000 evaluation images. Table 3: Quantitative results on the PASCAL-Context and ADE20K validation sets. (A sketch of how these two metrics are typically computed appears after this table.)
Hardware Specification | Yes | All the experiments are conducted on two NVIDIA V100 GPUs.
Software Dependencies | No | The paper mentions using "ResNet-101" and that "weights pretrained on ImageNet are adopted to initialize" the network, but it does not provide version numbers for any software dependencies, such as the deep learning framework (e.g., TensorFlow, PyTorch) or other libraries.
Experiment Setup | Yes | During training, we take a minibatch of 16 images and randomly crop patches of size 321 × 321 from the original images. We use the SGD optimizer with momentum set to 0.9 and weight decay of 0.0005. The initial base learning rate is set to 0.00025 for parameters in the feature extraction layers and ten times that for parameters in the classification module. Both learning rates are decayed under the scheme base_lr × (1 − epoch / max_epoch)^0.8. We set m = 20, α = 0.8, β = 1 in practice. (A hedged sketch of this optimization setup follows below.)
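
The evaluation row above refers to pixel accuracy and mIoU. The paper does not release evaluation code, so the following is only a minimal NumPy sketch of how these two metrics are commonly computed from a confusion matrix; the function names and the ignore_index value are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumption, not the authors' code) of pixel accuracy and mIoU
# computed from predicted and ground-truth label maps via a confusion matrix.
import numpy as np

def confusion_matrix(pred, gt, num_classes, ignore_index=255):
    """Accumulate a num_classes x num_classes matrix (rows: ground truth, cols: prediction)."""
    mask = gt != ignore_index                        # skip unlabeled / ignored pixels
    hist = np.bincount(
        num_classes * gt[mask].astype(int) + pred[mask].astype(int),
        minlength=num_classes ** 2,
    ).reshape(num_classes, num_classes)
    return hist

def pixel_accuracy_and_miou(hist):
    """Pixel accuracy = correct pixels / all pixels; mIoU = mean of per-class IoU."""
    pixel_acc = np.diag(hist).sum() / hist.sum()
    denom = hist.sum(axis=1) + hist.sum(axis=0) - np.diag(hist)
    with np.errstate(divide="ignore", invalid="ignore"):
        iou = np.diag(hist) / denom                  # NaN for classes absent in both
    miou = np.nanmean(iou)
    return pixel_acc, miou
```

In practice the confusion matrix would be accumulated over all validation images before the two scores are read off.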
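The experiment-setup row describes the optimizer and learning-rate schedule, but the paper names no framework. The sketch below shows one way to express that schedule in PyTorch; the choice of PyTorch, the placeholder backbone/classifier modules, the class count, and max_epoch are all assumptions made for illustration, not the authors' implementation.

```python
# Hedged PyTorch sketch of the quoted optimization setup: SGD with momentum 0.9,
# weight decay 5e-4, base LR 2.5e-4 for the feature extractor, 10x for the
# classification module, and a poly-style decay (1 - epoch / max_epoch) ** 0.8.
import torch
import torch.nn as nn

# Hypothetical stand-ins for the ResNet-101 feature extractor and the
# classification module; the real architecture is not reproduced here.
backbone = nn.Conv2d(3, 64, kernel_size=3, padding=1)
classifier = nn.Conv2d(64, 60, kernel_size=1)    # class count is illustrative

base_lr = 0.00025
max_epoch = 60                                   # not reported in the quote; assumed

optimizer = torch.optim.SGD(
    [
        {"params": backbone.parameters(), "lr": base_lr},         # feature extraction layers
        {"params": classifier.parameters(), "lr": base_lr * 10},  # classification module
    ],
    momentum=0.9,
    weight_decay=0.0005,
)

# LambdaLR multiplies each group's initial LR by the returned factor.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda epoch: (1 - epoch / max_epoch) ** 0.8
)

for epoch in range(max_epoch):
    # ... one epoch over minibatches of 16 images with random 321x321 crops ...
    scheduler.step()
```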