I Am Going MAD: Maximum Discrepancy Competition for Comparing Classifiers Adaptively

Authors: Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We report the MAD competition results of eleven ImageNet classifiers, while noting that the framework is readily extensible and cost-effective for adding future classifiers into the competition.
Researcher Affiliation | Academia | Haotao Wang, Department of Computer Science and Engineering, Texas A&M University, htwang@tamu.edu; Tianlong Chen, Department of Computer Science and Engineering, Texas A&M University, wiwjp619@tamu.edu; Zhangyang Wang, Department of Computer Science and Engineering, Texas A&M University, atlaswang@tamu.edu; Kede Ma, Department of Computer Science, City University of Hong Kong, kede.ma@cityu.edu.hk
Pseudocode | Yes | Algorithm 1: The MAD competition. Input: an unlabeled image set D, a group of image classifiers F = {f_1, ..., f_m} to be ranked, and a distance measure d_w defined over the WordNet hierarchy. Output: a global ranking vector r ∈ R^m. (A sketch of this loop appears after the table.)
Open Source Code | Yes | Code can be found at https://github.com/TAMU-VITA/MAD.
Open Datasets | Yes | We focus on ImageNet (Deng et al., 2009) for two reasons. First, it is one of the first large-scale and widely used datasets in image classification.
Dataset Splits | Yes | Although MAD allows us to arbitrarily increase n with essentially no cost, we choose the size of D to be approximately three times larger than the ImageNet validation set to provide a relatively easy environment for probing the generalizability of the classifiers.
Hardware Specification | No | The paper does not explicitly mention hardware details such as GPU models or CPU types.
Software Dependencies | No | The paper does not mention specific software dependencies with version numbers.
Experiment Setup | Yes | When constructing S using the maximum discrepancy principle, we add another constraint based on prediction confidence. Specifically, a candidate image x associated with f_i and f_j is filtered out if it does not satisfy min(p_i(x), p_j(x)) ≥ T, where p_i(x) is the confidence score (i.e., the probability produced by the last softmax layer) of f_i(x) and T is a predefined threshold set to 0.8. We include the confidence constraint for two main reasons. First, if f_i misclassifies x with low confidence, it is highly likely that x is near the decision boundary and thus contains less information for improving the decision rules of f_i. Second, some images in D do not necessarily fall into the 1,000 classes in ImageNet and are bound to be misclassified (a problem closely related to out-of-distribution detection). If they are misclassified by f_i with high confidence, we consider them hard counterexamples of f_i. To encourage class diversity in S, we retain a maximum of three images with the same predicted label by f_i. In addition, we exclude images that are non-natural. Figure 4 visually compares representative manhole cover images in S and the ImageNet validation set (see more in Figure 6). (A sketch of this filtering step follows the table.)
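
The pseudocode row above quotes only Algorithm 1's inputs and output. The minimal Python sketch below shows how such a competition loop could be wired together; it assumes each classifier is a callable returning a predicted label, that wordnet_distance implements d_w over the WordNet hierarchy, and that oracle_label supplies ground-truth annotations (human labeling in the paper). The final aggregation of pairwise wins into r is an illustrative assumption, not necessarily the authors' method.

```python
import itertools
import numpy as np

def mad_competition(images, classifiers, wordnet_distance, oracle_label, top_k=30):
    """Select maximally-discrepant images per classifier pair and
    aggregate pairwise outcomes into a global ranking vector r."""
    m = len(classifiers)
    pairwise = np.zeros((m, m))  # pairwise[i, j]: wins of f_i against f_j
    for i, j in itertools.combinations(range(m), 2):
        # Rank candidates by how strongly f_i and f_j disagree under d_w.
        scored = sorted(images,
                        key=lambda x: wordnet_distance(classifiers[i](x),
                                                       classifiers[j](x)),
                        reverse=True)
        for x in scored[:top_k]:          # the maximum-discrepancy subset
            y = oracle_label(x)           # human annotation in the paper
            pairwise[i, j] += int(classifiers[i](x) == y)
            pairwise[j, i] += int(classifiers[j](x) == y)
    # Aggregation into r is an assumption here (total pairwise wins);
    # the quoted text only specifies that r is a global ranking vector.
    return pairwise.sum(axis=1)
```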
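
The experiment-setup row fully specifies the confidence and class-diversity constraints, so they translate directly to code. The sketch below assumes each classifier is a callable returning a (predicted label, softmax confidence) pair; the function name and interface are illustrative, not taken from the TAMU-VITA/MAD repository.

```python
from collections import defaultdict

def filter_candidates(candidates, f_i, f_j, T=0.8, max_per_class=3):
    """Apply the confidence and class-diversity constraints when
    building the maximum-discrepancy set S for the pair (f_i, f_j)."""
    kept, per_class = [], defaultdict(int)
    for x in candidates:
        label_i, p_i = f_i(x)   # predicted label and softmax confidence
        _, p_j = f_j(x)
        if min(p_i, p_j) < T:   # drop x unless both classifiers are confident
            continue
        if per_class[label_i] >= max_per_class:
            continue            # at most three images per label predicted by f_i
        per_class[label_i] += 1
        kept.append(x)
    return kept
```

Setting T = 0.8 matches the threshold quoted above; max_per_class = 3 enforces the stated cap of three images per predicted label.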