Learning from brains how to regularize machines

Authors: Zhe Li, Wieland Brendel, Edgar Walker, Erick Cobos, Taliah Muhammad, Jacob Reimer, Matthias Bethge, Fabian Sinz, Zachary Pitkow, Andreas Tolias

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Despite impressive performance on numerous visual tasks, Convolutional Neural Networks (CNNs), unlike brains, are often highly sensitive to small perturbations of their input, e.g. adversarial noise, leading to erroneous decisions. We propose to regularize CNNs using large-scale neuroscience data to learn more robust neural features in terms of representational similarity. We presented natural images to mice and measured the responses of thousands of neurons from cortical visual areas. Next, we denoised the notoriously variable neural activity using strong predictive models trained on this large corpus of responses from the mouse visual system, and calculated the representational similarity for millions of pairs of images from the model's predictions. We then used the neural representational similarity to regularize CNNs trained on image classification by penalizing intermediate representations that deviated from neural ones. This preserved the performance of baseline models when classifying images under standard benchmarks, while maintaining substantially higher performance than baseline or control models when classifying noisy images. Moreover, the models regularized with cortical representations were also more robust to adversarial attacks.
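The core idea described above — penalizing a CNN's intermediate representations when their pairwise image similarities deviate from the neural ones — can be sketched in a few lines. This is a minimal illustrative version using cosine similarity and a mean-squared penalty; the function names are hypothetical and the paper's actual regularizer (similarity definition, weighting, and how it combines with the task loss) may differ.

```python
import numpy as np

def similarity_matrix(features):
    # Cosine similarity between every pair of flattened feature vectors,
    # one row per image.
    f = features.reshape(len(features), -1).astype(float)
    f = f / np.linalg.norm(f, axis=1, keepdims=True)
    return f @ f.T

def similarity_loss(model_features, neural_similarity):
    # Penalize the deviation of the model's pairwise similarity matrix
    # from the (denoised) neural one, over off-diagonal image pairs.
    s = similarity_matrix(model_features)
    mask = ~np.eye(len(s), dtype=bool)
    return float(np.mean((s[mask] - neural_similarity[mask]) ** 2))
```

In training, a term like `lambda * similarity_loss(...)` would be added to the classification loss, with the neural similarity matrix precomputed from the predictive model's denoised responses.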
Researcher Affiliation | Academia | 1. Department of Neuroscience, Baylor College of Medicine; 2. Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine; 3. Department of Electrical and Computer Engineering, Rice University; 4. Centre for Integrative Neuroscience, University of Tübingen; 5. Bernstein Center for Computational Neuroscience, University of Tübingen; 6. Institute for Theoretical Physics, University of Tübingen; 7. Institute for Bioinformatics and Medical Informatics, University of Tübingen
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | A Docker image containing all code and trained models is provided (zheli18/neural-reg:neurips19), with JupyterLab as the entry point.
Open Datasets | Yes | "In each experiment, we measured responses to 5100 different grayscale images sampled from the ImageNet dataset", "In this section, we implement grayscale CIFAR10 classification", and "training ResNet34 models on grayscale CIFAR100 datasets".
Dataset Splits | No | For the neural data: "100 of which were repeated 10 times to give 6000 trials in total. Each image was downsampled by a factor of four to 64 × 36 pixels. We call the repeated images 'oracle images', because the mean neural responses over these repeated trials serve as a high-quality predictor (oracle) for validation trials." For the CIFAR datasets, however, while a test set is mentioned, the specific training/validation/test splits (e.g., percentages or counts) are not provided.
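The quoted "oracle" idea — using the mean response over repeated presentations of an image as a predictor for held-out trials — can be made concrete with a small sketch. Leave-one-out averaging is a common way to implement it without letting a trial predict itself; that choice, and the function name, are assumptions here, not the paper's exact formula.

```python
import numpy as np

def oracle_correlation(responses):
    # responses: (n_repeats, n_images) activity of one neuron to the
    # repeated "oracle" images. Predict each trial from the mean of the
    # remaining repeats (leave-one-out), then correlate that prediction
    # with the held-out trial across images; return the mean correlation.
    n = responses.shape[0]
    loo_mean = (responses.sum(axis=0, keepdims=True) - responses) / (n - 1)
    corrs = [np.corrcoef(loo_mean[i], responses[i])[0, 1] for i in range(n)]
    return float(np.mean(corrs))
```

A high oracle correlation indicates reliable, well-denoisable responses; it also serves as an upper bound when evaluating predictive models of the neural data.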
Hardware Specification | Yes | It takes about 4.5 hours on a single TITAN RTX GPU to train one model.
Software Dependencies | No | "We used PyTorch [25] for model training." While PyTorch is mentioned, a specific version number is not provided, which is required for a reproducible description of software dependencies.
Experiment Setup | Yes | All models are trained by stochastic gradient descent for 40 epochs with batch size 64. The learning rate starts at 0.1 and decays by 0.3 every 4 epochs, but resets to 0.1 after the 20th epoch.
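The learning-rate schedule above can be written out explicitly. This sketch assumes "decays by 0.3" means multiply by 0.3 (a step decay) and that decay restarts from the reset point; both readings are interpretations of the quoted setup, and the function signature is illustrative.

```python
def learning_rate(epoch, base=0.1, decay=0.3, step=4, reset_at=20):
    # Step schedule as described in the review: start at `base`,
    # multiply by `decay` every `step` epochs, then reset to `base`
    # at epoch `reset_at` and decay again from there.
    e = epoch if epoch < reset_at else epoch - reset_at
    return base * decay ** (e // step)
```

In PyTorch, the same shape could be obtained with `torch.optim.lr_scheduler.LambdaLR` wrapping this function.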