On Model Calibration for Long-Tailed Object Detection and Instance Segmentation

Authors: Tai-Yu Pan, Cheng Zhang, Yandong Li, Hexiang Hu, Dong Xuan, Soravit Changpinyo, Boqing Gong, Wei-Lun Chao

NeurIPS 2021

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "We validate NORCAL on the LVIS [12] dataset for both long-tailed object detection and instance segmentation. NORCAL consistently improves not only baseline models (e.g., Faster R-CNN [43] or Mask R-CNN [18]) but also many models dedicated to the long-tailed distribution. Hence, our best results notably advance the state of the art. Moreover, NORCAL improves both the standard average precision (AP) and the category-independent APFixed metric [7], implying that NORCAL does not trade frequent-class predictions for rare classes but rather improves the proposal ranking within each class. Indeed, through a detailed analysis, we show that NORCAL in general improves both the precision and recall for each class, making it appealing under almost any existing evaluation metric."

Researcher Affiliation | Collaboration | Tai-Yu Pan (1), Cheng Zhang (1), Yandong Li (2), Hexiang Hu (2), Dong Xuan (1), Soravit Changpinyo (2), Boqing Gong (2), Wei-Lun Chao (1). (1) The Ohio State University; (2) Google Research.

Pseudocode | No | No pseudocode or algorithm block found.

Open Source Code | Yes | "Our code is publicly available at https://github.com/tydpan/NorCal."

Open Datasets | Yes | "We validate NORCAL on the LVIS v1 dataset [12], a benchmark dataset for large-vocabulary instance segmentation which has 100K/19.8K/19.8K training/validation/test images."

Dataset Splits | Yes | "We validate NORCAL on the LVIS v1 dataset [12], a benchmark dataset for large-vocabulary instance segmentation which has 100K/19.8K/19.8K training/validation/test images. ... All results are reported on the validation set of LVIS v1."

Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are provided.

Software Dependencies | No | No specific software dependencies with version numbers are mentioned.

Experiment Setup | Yes | "We apply NORCAL to post-calibrate several representative baseline models, for which we use the released checkpoints from the corresponding papers. ... For NORCAL, (a) we investigate different mechanisms by applying post-calibration to the classifier logits, exponentials, or probabilities (cf. Eq. 4); (b) we study different types of calibration factor a_c, using the class-dependent temperature (CDT) [61] presented in Eq. 5 or the effective number of samples (ENS) [6]; (c) we compare with or without score normalization. We tune the only hyper-parameter of NORCAL (i.e., in a_c) on training data."
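The experiment-setup row above describes NORCAL's core idea: post-hoc calibration that divides each foreground class's score by a class-dependent factor a_c, optionally followed by renormalization. A minimal NumPy sketch of that idea is below. The function name `norcal_post_calibrate`, the choice a_c = n_c**gamma (with n_c the training-set count of class c), and the omission of the separate background-class handling are assumptions for illustration, not the paper's exact implementation.

```python
import numpy as np

def norcal_post_calibrate(logits, class_counts, gamma=0.5, normalize=True):
    """Sketch of post-hoc score calibration for long-tailed classes.

    Divides each class's exponentiated logit exp(z_c) by a
    class-dependent factor a_c = n_c ** gamma (assumed form), so that
    frequent classes are suppressed relative to rare ones, then
    optionally renormalizes the scores to sum to one.
    """
    a = np.asarray(class_counts, dtype=float) ** gamma  # a_c, one factor per class
    scores = np.exp(np.asarray(logits, dtype=float)) / a  # calibrated exp(z_c) / a_c
    if normalize:
        scores = scores / scores.sum(axis=-1, keepdims=True)
    return scores
```

For example, two classes with identical logits but very different training frequencies (say 10,000 vs. 10 instances) end up with the rare class ranked higher after calibration, which is the intended tail-boosting behavior.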