Examples are not enough, learn to criticize! Criticism for Interpretability

Authors: Been Kim, Rajiv Khanna, Oluwasanmi O. Koyejo

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | A human subject pilot study shows that the MMD-critic selects prototypes and criticism that are useful to facilitate human understanding and reasoning. We also evaluate the prototypes selected by MMD-critic via a nearest prototype classifier, showing competitive performance compared to baselines.
Researcher Affiliation | Collaboration | Been Kim (Allen Institute for AI, beenkim@csail.mit.edu); Rajiv Khanna (UT Austin, rajivak@utexas.edu); Oluwasanmi Koyejo (UIUC, sanmi@illinois.edu)
Pseudocode | Yes | Algorithm 1 Greedy algorithm, max F(S) s.t. |S| ≤ m
Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the MMD-critic implementation or a link to a code repository.
Open Datasets | Yes | We present results for the proposed technique MMD-critic using USPS hand written digits (Hull, 1994) and Imagenet (Deng et al., 2009) datasets. The USPS hand written digits dataset Hull (1994) consists of n = 7291 training (and 2007 test) greyscale images of 10 handwritten digits from 0 to 9.
Dataset Splits | Yes | The kernel hyperparameter γ was chosen based to maximize the average cross-validated classification performance, then fixed for all other experiments.
Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., GPU models, CPU types, memory). It only mentions general concepts like 'image embeddings'.
Software Dependencies | No | The paper mentions using 'radial basis function (RBF) kernel' but does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow, or specific library versions) that would be needed for reproducibility.
Experiment Setup | Yes | The kernel hyperparameter γ was chosen based to maximize the average cross-validated classification performance, then fixed for all other experiments.
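
The Pseudocode entry above refers to a greedy algorithm that maximizes a set function F(S) under a limit on the size of S, and the record notes that no source code accompanies the paper. The following is a minimal sketch of that kind of greedy selection, not the authors' implementation. It assumes an RBF kernel and an MMD-style objective for F; the function names (`rbf_kernel`, `kernel_matrix`, `mmd_objective`, `greedy_select`), the toy data, and the value `gamma=1.0` are all hypothetical choices made for illustration.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_matrix(points, gamma):
    """Dense n x n kernel matrix as a list of lists."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]


def mmd_objective(K, S):
    """Assumed objective F(S), not taken from this record:
    (2 / (n|S|)) * sum_{i in [n], j in S} K[i][j]
      - (1 / |S|^2) * sum_{i, j in S} K[i][j].
    Larger values mean S summarizes the data better under this assumption.
    """
    if not S:
        return 0.0
    n, s = len(K), len(S)
    cross = sum(K[i][j] for i in range(n) for j in S)
    within = sum(K[i][j] for i in S for j in S)
    return 2.0 * cross / (n * s) - within / (s * s)


def greedy_select(K, m, objective=mmd_objective):
    """Grow S one index at a time, always adding the candidate that
    gives the largest objective value, until |S| = m."""
    selected = []
    candidates = set(range(len(K)))
    while len(selected) < m and candidates:
        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(candidates),
                   key=lambda i: objective(K, selected + [i]))
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (hypothetical): two well-separated clusters of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
K = kernel_matrix(points, gamma=1.0)
prototypes = greedy_select(K, m=2)
print(prototypes)
```

On this toy input the two selected indices should fall in different clusters, because under the assumed objective a second point from an already covered cluster adds little. Recomputing the objective from scratch for every candidate keeps the sketch short; a practical version would likely update the sums incrementally.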