I Am Going MAD: Maximum Discrepancy Competition for Comparing Classifiers Adaptively

Authors: Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We report the MAD competition results of eleven ImageNet classifiers, while noting that the framework is readily extensible and cost-effective for adding future classifiers into the competition.
Researcher Affiliation | Academia | Haotao Wang, Department of Computer Science and Engineering, Texas A&M University, htwang@tamu.edu; Tianlong Chen, Department of Computer Science and Engineering, Texas A&M University, wiwjp619@tamu.edu; Zhangyang Wang, Department of Computer Science and Engineering, Texas A&M University, atlaswang@tamu.edu; Kede Ma, Department of Computer Science, City University of Hong Kong, kede.ma@cityu.edu.hk
Pseudocode | Yes | Algorithm 1: The MAD competition. Input: an unlabeled image set D, a group of image classifiers F = {f_1, ..., f_m} to be ranked, and a distance measure d_w defined over the WordNet hierarchy. Output: a global ranking vector r ∈ R^m. (A sketch of this loop appears after the table.)
Open Source Code | Yes | Code can be found at https://github.com/TAMU-VITA/MAD.
Open Datasets | Yes | We focus on ImageNet (Deng et al., 2009) for two reasons. First, it is one of the first large-scale and widely used datasets in image classification.
Dataset Splits | Yes | Although MAD allows us to arbitrarily increase n with essentially no cost, we choose the size of D to be approximately three times larger than the ImageNet validation set to provide a relatively easy environment for probing the generalizability of the classifiers.
Hardware Specification | No | The paper does not explicitly mention hardware details such as GPU models or CPU types.
Software Dependencies | No | The paper does not mention specific software dependencies with version numbers.
Experiment Setup | Yes | When constructing S using the maximum discrepancy principle, we add another constraint based on prediction confidence. Specifically, a candidate image x associated with f_i and f_j is filtered out if it does not satisfy min(p_i(x), p_j(x)) ≥ T, where p_i(x) is the confidence score (i.e., the probability produced by the last softmax layer) of f_i(x) and T is a predefined threshold set to 0.8. We include the confidence constraint for two main reasons. First, if f_i misclassifies x with low confidence, it is highly likely that x is near the decision boundary and thus contains less information for improving the decision rules of f_i. Second, some images in D do not necessarily fall into the 1,000 classes in ImageNet and are bound to be misclassified (a problem closely related to out-of-distribution detection). If they are misclassified by f_i with high confidence, we consider them hard counterexamples of f_i. To encourage class diversity in S, we retain a maximum of three images with the same predicted label by f_i. In addition, we exclude images that are non-natural. Figure 4 visually compares representative manhole cover images in S and the ImageNet validation set (see more in Figure 6). (A sketch of this filtering step follows the table.)
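
The pseudocode row above quotes only Algorithm 1's inputs and output. The minimal Python sketch below shows how such a competition loop could be wired together; it assumes each classifier is a callable returning a predicted label, that wordnet_distance implements d_w over the WordNet hierarchy, and that oracle_label supplies ground-truth annotations (human labeling in the paper). The final aggregation of pairwise wins into r is an illustrative assumption, not necessarily the authors' method.

```python
import itertools
import numpy as np

def mad_competition(images, classifiers, wordnet_distance, oracle_label, top_k=30):
    """Select maximally-discrepant images per classifier pair and
    aggregate pairwise outcomes into a global ranking vector r."""
    m = len(classifiers)
    pairwise = np.zeros((m, m))  # pairwise[i, j]: wins of f_i against f_j
    for i, j in itertools.combinations(range(m), 2):
        # Rank candidates by how strongly f_i and f_j disagree under d_w.
        scored = sorted(images,
                        key=lambda x: wordnet_distance(classifiers[i](x),
                                                       classifiers[j](x)),
                        reverse=True)
        for x in scored[:top_k]:          # the maximum-discrepancy subset
            y = oracle_label(x)           # human annotation in the paper
            pairwise[i, j] += int(classifiers[i](x) == y)
            pairwise[j, i] += int(classifiers[j](x) == y)
    # Aggregation into r is an assumption here (total pairwise wins);
    # the quoted text only specifies that r is a global ranking vector.
    return pairwise.sum(axis=1)
```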
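
The experiment-setup row fully specifies the confidence and class-diversity constraints, so they translate directly to code. The sketch below assumes each classifier is a callable returning a (predicted label, softmax confidence) pair; the function name and interface are illustrative, not taken from the TAMU-VITA/MAD repository.

```python
from collections import defaultdict

def filter_candidates(candidates, f_i, f_j, T=0.8, max_per_class=3):
    """Apply the confidence and class-diversity constraints when
    building the maximum-discrepancy set S for the pair (f_i, f_j)."""
    kept, per_class = [], defaultdict(int)
    for x in candidates:
        label_i, p_i = f_i(x)   # predicted label and softmax confidence
        _, p_j = f_j(x)
        if min(p_i, p_j) < T:   # drop x unless both classifiers are confident
            continue
        if per_class[label_i] >= max_per_class:
            continue            # at most three images per label predicted by f_i
        per_class[label_i] += 1
        kept.append(x)
    return kept
```

Setting T = 0.8 matches the threshold quoted above; max_per_class = 3 enforces the stated cap of three images per predicted label.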