Strong and Precise Modulation of Human Percepts via Robustified ANNs

Authors: Guy Gaziv, Michael Lee, James J DiCarlo

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Consistent with this, we show that when small-norm image perturbations are generated by standard ANN models, human object category percepts are indeed highly stable. However, in this very same human-presumed-stable regime, we find that robustified ANNs reliably discover low-norm image perturbations that strongly disrupt human percepts. To study the effects of small image perturbations on human visual object categorization, we used a two-stage methodology: (i) generate small image perturbations predicted to modulate human behavior by highly-ranked models of the ventral visual stream and control models, then (ii) collect human object categorization reports in a nine-way choice task, identical to that performed by the models, using Amazon Mechanical Turk surveys.
Researcher Affiliation | Academia | Guy Gaziv, Michael J. Lee, James J. DiCarlo; McGovern Institute for Brain Research, Dept. of Brain and Cognitive Sciences, Massachusetts Institute of Technology
Pseudocode | No | The paper describes the Projected Gradient Descent (PGD) attack algorithm using mathematical equations and text, but does not include a formally labeled 'Pseudocode' or 'Algorithm' block.
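Since no pseudocode is given, the following is a minimal sketch of a targeted, ℓ2-constrained PGD loop consistent with the paper's textual description. It assumes a PyTorch classifier with inputs in [0, 1]; all names (pgd_l2, n_steps, etc.) are illustrative, not taken from the paper's code.

import torch
import torch.nn.functional as F

def pgd_l2(model, x, y_target, eps, step_size, n_steps):
    # Illustrative l2-constrained targeted PGD; not the paper's exact code.
    x_adv = x.clone().detach()
    for _ in range(n_steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y_target)
        grad, = torch.autograd.grad(loss, x_adv)
        # Take a fixed-size step along the normalized gradient
        # (descend, since the attack is targeted toward y_target).
        g_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
        x_adv = x_adv.detach() - step_size * grad / g_norm
        # Project the perturbation back onto the l2 ball of radius eps,
        # then clip to the valid pixel range.
        delta = x_adv - x
        d_norm = delta.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
        x_adv = (x + delta * (eps / d_norm).clamp(max=1.0)).clamp(0.0, 1.0)
    return x_adv.detach()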
Open Source Code | Yes | Code webpage
Open Datasets | Yes | Dataset. We sought to focus on a dataset with the following key properties: (i) has a tractable class-space for exploration; (ii) will mitigate confounds originating from typical-human unfamiliarity with the class labels (as is commonly the case in ImageNet); (iii) is widely used in the adversarial robustness context. Based on these considerations, we found a partial version of ImageNet, mapped to a basic set of nine classes and termed Restricted ImageNet, to be the most suitable choice (see Supplementary Material for class mapping) [32]. To adversarially-train models on ImageNet [38]
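For orientation, the nine-class mapping is implemented by the robustness library's RestrictedImageNet dataset, which can be instantiated roughly as below; the ImageNet path is a placeholder, and the exact class-index ranges live in the library and the paper's Supplementary Material.

from robustness import datasets

# RestrictedImageNet groups contiguous ranges of ImageNet class indices
# into nine basic categories; the ranges are defined inside the library.
ds = datasets.RestrictedImageNet('/path/to/imagenet')  # placeholder path
train_loader, val_loader = ds.make_loaders(workers=8, batch_size=100)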
Dataset Splits | No | The paper mentions a 'Restricted ImageNet validation set' in the context of warmup/reference trials, but it does not provide specific percentages or counts for training, validation, and test splits for model training, nor does it reference predefined splits with explicit citations for reproducibility.
Hardware Specification | Yes | Runtime. Our stimuli generation completes within 5 min for a single batch of size 100 on a single A100 GPU. Adversarial-training of models completes within 8 days on 4 A100 GPUs.
Software Dependencies | Yes | We used the Projected Gradient Descent (PGD) attack algorithm [31, 36, 37], using an adapted version of the robustness library [32]. Logan Engstrom, Andrew Ilyas, et al., robustness (Python library), 2019.
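A minimal sketch of how the robustness library is typically used to generate such perturbations, assuming a locally trained checkpoint; the paths are placeholders, and the specific eps/step_size/iterations values are illustrative picks from the paper's reported ranges.

import torch
from robustness import datasets, model_utils

ds = datasets.RestrictedImageNet('/path/to/imagenet')   # placeholder path
model, _ = model_utils.make_and_restore_model(
    arch='resnet50', dataset=ds,
    resume_path='/path/to/checkpoint.pt')               # placeholder path
model.eval()

# Targeted l2 attack via the library's attacker interface; the kwargs
# mirror the paper's budget/steps/step-size settings.
attack_kwargs = dict(constraint='2', eps=10.0, step_size=1.5,
                     iterations=10, targeted=True)
images = torch.rand(4, 3, 224, 224)        # stand-in batch of images
target_labels = torch.randint(0, 9, (4,))  # stand-in target classes
_, images_adv = model(images, target_labels, make_adv=True, **attack_kwargs)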
Experiment Setup | Yes | All original input images are resized to 256 x 256 (bilinear, anti-aliased) and center-cropped to 224 x 224 [22]. We focused on the following range of ℓ2 pixel budgets in the low-budget regime: [0.1, 0.5, 1.0, 2.0, 3.0, 5.0, 7.5, 10, 15, 20, 25, 30, 40, 50]. We set the number of PGD steps and the step size, (k_steps, η), so as to match the pixel budget [32], ranging from (200, 0.02) for ϵ = 0.1 to (2000, 2) for ϵ = 50. Specifically, we trained ResNet50 models at ℓ2 pixel budgets of 1.0, 3.0, and 10.0, using PGD (steps, step-size) of (7, 0.3), (7, 0.5), and (10, 1.5), respectively. In each trial, a single test image randomly chosen from the full set of TM or DM perturbation images was presented at the center of gaze for a fixed duration of 200 ms, after which the rater was shown a category choice screen with nine category options (dog, cat, frog, turtle, bird, primate, fish, crab, insect). No pre-mask or post-mask was used.
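A sketch of the stated preprocessing using torchvision transforms, with the budget grid and PGD schedule written out as data. Only the endpoint (k_steps, η) pairs are stated explicitly in the excerpt, so the intermediate pairs are left to be filled in from the paper or code.

from torchvision import transforms

# Resize to 256x256 (bilinear, anti-aliased), then center-crop to 224x224.
preprocess = transforms.Compose([
    transforms.Resize((256, 256),
                      interpolation=transforms.InterpolationMode.BILINEAR,
                      antialias=True),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# l2 pixel budgets explored in the low-budget regime.
EPS_GRID = [0.1, 0.5, 1.0, 2.0, 3.0, 5.0, 7.5, 10, 15, 20, 25, 30, 40, 50]

# (k_steps, eta) per budget: only the endpoints are given explicitly;
# the settings for intermediate budgets come from the paper/code.
PGD_SCHEDULE = {0.1: (200, 0.02), 50: (2000, 2)}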