Adversarially trained neural representations are already as robust as biological neural representations

Authors: Chong Guo, Michael Lee, Guillaume Leclerc, Joel Dapello, Yug Rao, Aleksander Madry, James DiCarlo

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we develop a method for performing adversarial visual attacks directly on primate brain activity. We then leverage this method to demonstrate that the above-mentioned belief might not be well founded. Specifically, we report that the biological neurons that make up visual systems of primates exhibit susceptibility to adversarial perturbations that is comparable in magnitude to existing (robustly trained) artificial neural networks.
Researcher Affiliation | Academia | 1McGovern Institute for Brain Research, MIT; 2Department of Brain and Cognitive Sciences, MIT; 3Center for Brains, Minds and Machines, MIT; 4Computer Science and Artificial Intelligence Laboratory, MIT; 5School of Engineering and Applied Sciences, Harvard University; 6Purdue University; 7Department of Electrical Engineering and Computer Science, MIT.
Pseudocode | No | Figure 4 presents a diagram outlining the experimental pipeline with boxes and arrows, but it is not pseudocode or a formally structured algorithm block.
Open Source Code | No | The paper does not contain any explicit statement about releasing its source code, nor does it provide a link to a code repository for the methodology described.
Open Datasets | Yes | Our primary goal was to measure the sensitivity of the response of individual IT sites to worst-case local pixel perturbations of visual stimuli. For each neural site i, we measure its response r_i(x) to clean images x ∈ D, where D is the ImageNet training set (Deng et al., 2009). ... The clean images used on day 0, X_{t=0}, consist of 1000 clean images sampled from one of each of the 1000 ImageNet classes from the clean training set.
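To make the quoted stimulus-set construction concrete, below is a minimal sketch (not the authors' released code) of drawing one clean image per ImageNet class with torchvision; the dataset path and the random seed are assumptions made purely for illustration.

```python
# Sketch: assemble the day-0 stimulus set X_{t=0} by sampling one clean image
# per ImageNet class. The path and seed below are hypothetical placeholders.
import random
from collections import defaultdict
from torchvision.datasets import ImageFolder

imagenet_train = ImageFolder("/path/to/imagenet/train")  # hypothetical path

# Group sample indices by class label (ImageFolder stores (path, class) pairs).
indices_by_class = defaultdict(list)
for idx, (_, label) in enumerate(imagenet_train.samples):
    indices_by_class[label].append(idx)

# Draw one image index per class -> 1000 clean stimuli for day 0.
random.seed(0)
clean_indices = [random.choice(idxs) for idxs in indices_by_class.values()]
print(len(clean_indices))  # 1000 for the full ImageNet training set
```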
Dataset Splits | No | The paper mentions using subsets of the ImageNet training set for various purposes (e.g., '1000 clean images sampled from one of each of the 1000 ImageNet classes' or '12k images from the ImageNet training set' for model tuning), but it does not specify explicit training/validation/test splits (e.g., percentages or counts) for the data collected in its own experiments, so the data partitioning cannot be reproduced.
Hardware Specification | No | The paper mentions specific hardware for data collection (e.g., "99-channel Utah arrays... implanted in anterior and central IT (Blackrock Neurotech)"), but it does not specify any hardware details (such as GPU/CPU models or memory) used for running the computational models, training, or analysis.
Software Dependencies | No | The paper mentions software components such as "ResNet50", "AT-ResNet50", and "AT-WideResNet50-4", and refers to "PGD" for optimization and the "robustness" Python library (Engstrom et al., 2019a). However, it does not provide specific version numbers for any of these software dependencies, programming languages, or libraries.
Experiment Setup | Yes | We found the best baseline model is an adversarially pre-trained ImageNet model, AT-ResNet50 (ℓ2, ϵ = 2), that is linearly mapped with channel-factorized weights from layer 4.0 to a 21-dimensional output layer to model the IT neural sites... Using this randomly mapped baseline model, we optimize attack images independently for each model neuron using PGD with random starts, 100 steps, and step size = ϵ/3... we perform 250 steps of projected gradient descent, first with a ball of radius 2ϵ and finally with one of radius ϵ. The visual stimuli are presented 8 degrees over the visual field for 100 ms followed by a 100 ms grey mask, as in a standard rapid serial visual presentation (RSVP) task.
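The PGD settings quoted above (random start, 100 steps, step size ϵ/3, ℓ2 ball of radius ϵ) can be illustrated with the following sketch. This is an assumption-laden stand-in, not the paper's pipeline: `model_neuron` is a hypothetical callable standing in for AT-ResNet50 layer-4.0 features followed by the fitted linear readout for one IT site, and the two-stage 250-step refinement with radii 2ϵ and then ϵ is not reproduced here.

```python
# Minimal L2 PGD sketch under stated assumptions, not the released pipeline.
# Perturbs images inside an L2 ball (radius eps) with a random start, 100 steps,
# and step size eps/3, to maximize (or minimize) one model neuron's response.
# `model_neuron`: hypothetical callable, image batch in [0, 1] -> per-image response.
import torch

def pgd_l2(model_neuron, x_clean, eps=2.0, steps=100, maximize=True):
    step_size = eps / 3.0

    # Random start on the L2 sphere of radius eps.
    delta = torch.randn_like(x_clean)
    delta = eps * delta / delta.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
    delta.requires_grad_(True)

    for _ in range(steps):
        objective = model_neuron(x_clean + delta).sum()
        if not maximize:
            objective = -objective
        (grad,) = torch.autograd.grad(objective, delta)

        with torch.no_grad():
            # Normalized gradient ascent step.
            g_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
            delta += step_size * grad / g_norm
            # Project back onto the L2 ball of radius eps.
            d_norm = delta.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
            delta *= (eps / d_norm).clamp(max=1.0)
            # Keep the perturbed image within the valid [0, 1] pixel range.
            delta.clamp_(min=-x_clean, max=1.0 - x_clean)

    return (x_clean + delta).detach()
```

As a design note, projecting after every step (rather than only at the end) keeps each intermediate image within the perturbation budget, matching the standard PGD formulation the paper attributes to the robustness library.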