Don’t trust your eyes: on the (un)reliability of feature visualizations

Authors: Robert Geirhos, Roland S. Zimmermann, Blair Bilodeau, Wieland Brendel, Been Kim

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We start our investigation by developing network circuits that trick feature visualizations into showing arbitrary patterns that are completely disconnected from normal network behavior on natural input. We then provide evidence for a similar phenomenon occurring in standard, unmanipulated networks: feature visualizations are processed very differently from standard input, casting doubt on their ability to explain how neural networks process natural images. This can be used as a sanity check for feature visualizations. We underpin our empirical findings by theory proving that the set of functions that can be reliably understood by feature visualization is extremely small and does not include general black-box neural networks. (A conceptual fooling-circuit sketch follows after the table.)
Researcher Affiliation | Collaboration | 1 Google DeepMind, 2 Max Planck Institute for Intelligent Systems, 3 Tübingen AI Center, 4 Department of Statistical Sciences, University of Toronto.
Pseudocode | No | The paper does not contain any explicitly labeled “Pseudocode” or “Algorithm” blocks or figures.
Open Source Code | Yes | Code to replicate experiments from this paper is available here: https://github.com/google-research/fooling-feature-visualizations/
Open Datasets | Yes | For classifier training, we create a dataset by combining 1,281,167 images from the training set of the ImageNet 2012 dataset and 472,500 synthetic images. (A dataset-assembly sketch follows after the table.)
Dataset Splits | Yes | This can be verified by checking the network's validation accuracy on ImageNet-1K, which only minimally changes when deceiving all visualizations in the last layer of Inception-V1 (top-1 accuracy changes from 69.146% to 68.744%; top-5 from 88.858% to 88.330%); the evaluation uses natural ImageNet validation images. (A sketch of this top-1/top-5 sanity check follows after the table.)
Hardware Specification | No | No specific hardware details (such as GPU models, CPU types, or memory specifications) used for running the experiments are provided in the paper. The paper states: “There are no special compute requirements (e.g., we do not train large models).”
Software Dependencies | Yes | Throughout the paper, feature visualizations were generated using the lucent library (Greentfrapp), version v0.1.8.
Experiment Setup | Yes | We train a model implementing the simple six-layer CNN architecture displayed in Table 3 for 8 epochs on the aforementioned dataset with an SGD optimizer using a learning rate of 0.01, momentum of 0.9 and weight decay of 0.00005. For Figure 1, we used the thresholds that visually looked best... specifically thresholds=(512,512,512,6,32,6) for the six visualizations from left to right... transforms=lucent.optvis.transform.standard_transforms + [center_crop(224, 224)] was used. The image was parameterized via param_f=lambda: lucent.optvis.param.image(224, batch=1). (A sketch assembling this setup follows after the table.)
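To make the fooling-circuit idea from the Research Type row concrete, here is a minimal conceptual sketch of the mechanism, not the paper's exact construction: an output unit that matches the original unit on natural images, while a detector that fires on visualization-style inputs lets an arbitrary second unit dominate during feature-visualization optimization. All class and argument names below are hypothetical.

```python
import torch
import torch.nn as nn

class FoolingCircuit(nn.Module):
    """Conceptual sketch (hypothetical, not the paper's exact circuit) of a unit whose
    feature visualization is decoupled from its behavior on natural input."""

    def __init__(self, original_unit: nn.Module, arbitrary_unit: nn.Module,
                 detector: nn.Module, gain: float = 10.0):
        super().__init__()
        self.original_unit = original_unit    # normal activations on natural images
        self.arbitrary_unit = arbitrary_unit  # drives the (deceptive) visualization pattern
        self.detector = detector              # ~0 on natural images, ~1 on optimized inputs
        self.gain = gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        is_vis = self.detector(x)  # soft indicator in [0, 1]
        # Natural input: output is close to the original unit, so accuracy barely changes.
        # Visualization input: the arbitrary unit dominates, so the optimized image
        # shows a pattern unrelated to the unit's behavior on natural images.
        return (1 - is_vis) * self.original_unit(x) + self.gain * is_vis * self.arbitrary_unit(x)
```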
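For the Open Datasets row, a minimal sketch of assembling the combined classifier-training set with torchvision, assuming the synthetic images are stored in an ImageFolder-style directory; the paths and the preprocessing pipeline are hypothetical.

```python
from torch.utils.data import ConcatDataset
from torchvision import datasets, transforms

# Hypothetical preprocessing; the excerpt does not specify the transform.
tfm = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# 1,281,167 ImageNet-2012 training images plus 472,500 synthetic images.
imagenet_train = datasets.ImageFolder("/data/imagenet2012/train", transform=tfm)
synthetic = datasets.ImageFolder("/data/synthetic_images", transform=tfm)
combined_train = ConcatDataset([imagenet_train, synthetic])
```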
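The Dataset Splits row's sanity check amounts to re-measuring top-1/top-5 accuracy on the ImageNet-1K validation split before and after the manipulation. A generic sketch, assuming a standard PyTorch model and a labeled validation DataLoader:

```python
import torch

@torch.no_grad()
def topk_accuracy(model, val_loader, ks=(1, 5), device="cuda"):
    """Return top-k accuracies (in %) over a labeled validation loader."""
    model.eval().to(device)
    correct = {k: 0 for k in ks}
    total = 0
    for images, labels in val_loader:
        logits = model(images.to(device))
        preds = logits.topk(max(ks), dim=1).indices.cpu()
        for k in ks:
            correct[k] += (preds[:, :k] == labels.unsqueeze(1)).any(dim=1).sum().item()
        total += labels.size(0)
    return {k: 100.0 * correct[k] / total for k in ks}

# For the manipulated Inception-V1 the paper reports only a small drop,
# e.g. top-1: 69.146% -> 68.744% and top-5: 88.858% -> 88.330%.
```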
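Finally, for the Experiment Setup row, a sketch assembling the quoted hyperparameters and lucent arguments. The stand-in network, the objective string, and the center_crop helper are assumptions (the real six-layer architecture is given in Table 3 of the paper and is not reproduced here); only the optimizer settings and the render_vis arguments follow the quoted text.

```python
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF
from lucent.optvis import render, param, transform
from lucent.modelzoo import inceptionv1

# Stand-in for the six-layer CNN of Table 3 (placeholder so the snippet is self-contained).
cnn = nn.Sequential(
    nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(128, 2),  # output size is an assumption
)
# Optimizer configuration as stated: SGD, lr 0.01, momentum 0.9, weight decay 0.00005, 8 epochs.
optimizer = torch.optim.SGD(cnn.parameters(), lr=0.01, momentum=0.9, weight_decay=0.00005)
num_epochs = 8

# Feature visualization with lucent (v0.1.8), mirroring the quoted arguments.
def center_crop(h, w):
    # Assumed helper matching the quoted transform; lucent does not ship a center_crop.
    return lambda t: TF.center_crop(t, [h, w])

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = inceptionv1(pretrained=True).to(device).eval()
images = render.render_vis(
    model,
    "mixed4a:476",  # illustrative unit; the paper visualizes units of its manipulated network
    param_f=lambda: param.image(224, batch=1),
    transforms=transform.standard_transforms + [center_crop(224, 224)],
    thresholds=(512,),  # Figure 1 used per-panel values (512, 512, 512, 6, 32, 6)
)
```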