Towards the first adversarially robust neural network model on MNIST

Authors: Lukas Schott, Jonas Rauber, Matthias Bethge, Wieland Brendel

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We derive bounds on the robustness and go to great length to empirically evaluate our model using maximally effective adversarial attacks by (a) applying decision-based, score-based, gradient-based and transfer-based attacks for several different Lp norms, (b) by designing a new attack that exploits the structure of our defended model and (c) by devising a novel decision-based attack that seeks to minimize the number of perturbed pixels (L0). The results suggest that our approach yields state-of-the-art robustness on MNIST against L0, L2 and L∞ perturbations."
Researcher Affiliation | Academia | Lukas Schott (1-3), Jonas Rauber (1-3), Matthias Bethge (1,3,4) & Wieland Brendel (1,3); (1) Centre for Integrative Neuroscience, University of Tübingen; (2) International Max Planck Research School for Intelligent Systems; (3) Bernstein Center for Computational Neuroscience Tübingen; (4) Max Planck Institute for Biological Cybernetics
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "We will release the model architecture and trained weights as a friendly invitation to fellow researchers to evaluate our model independently."
Open Datasets | Yes | The paper's title, "Towards the first adversarially robust neural network model on MNIST", identifies MNIST, a publicly available benchmark dataset.
Dataset Splits | No | The paper mentions training and testing but does not explicitly specify a validation split (exact percentages, sample counts, or splitting methodology).
Hardware Specification | No | No specific hardware details (GPU/CPU models, processor types, or memory amounts) used to run the experiments are provided.
Software Dependencies | No | The paper mentions "Foolbox v1.3 (Rauber et al., 2017)" and various algorithms but does not provide a comprehensive list of software dependencies with version numbers (e.g., Python, PyTorch/TensorFlow, CUDA).
Experiment Setup | Yes | "Hyperparameters and training details for the ABS model: The binary ABS and ABS have the same weights and architecture. The encoder has 4 layers with kernel sizes = [5, 4, 3, 5], strides = [1, 2, 2, 1] and feature map sizes = [32, 32, 64, 2·8]. The first 3 layers have ELU activation functions (Clevert et al., 2015), the last layer is linear. All except the last layer use Batch Normalization (Ioffe & Szegedy, 2015). The decoder architecture also has 4 layers, with kernel sizes = [4, 5, 5, 3], strides = [1, 2, 2, 1] and feature map sizes = [32, 16, 16, 1]. The first 3 layers have ELU activation functions, the last layer has a sigmoid activation function, and all layers except the last one use Batch Normalization. We trained the VAEs with the Adam optimizer (Kingma & Ba, 2014). We tuned the dimension L of the latent space of the class-conditional VAEs (ending up with L = 8) to achieve 99% test accuracy; started with a high weight for the KL-divergence term at the beginning of training (gradually decreased from a factor of 10 to 1 over 50 epochs); and estimated the weighting γ = [1, 0.96, 1.001, 1.06, 0.98, 0.96, 1.03, 1, 1, 1] of the lower bound via a line search on the training accuracy. The parameters maximizing the test cross entropy and providing a median confidence of p(y|x) = 0.9 for our modified softmax (equation 8) are η = 0.000039 and α = 440."
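The architecture quoted above can be sketched in PyTorch. This is an illustrative reconstruction, not the authors' released code: the paper gives kernel sizes, strides, feature-map counts, and activations, but not padding, layer types for the decoder, or output padding, so the choices below (zero padding everywhere, `ConvTranspose2d` decoder layers, `output_padding=1` on the third decoder layer) are assumptions made so that 28×28 MNIST inputs map to a 2·8-channel code and back to 28×28 reconstructions.

```python
# Hypothetical sketch of the ABS per-class VAE encoder/decoder.
# Layer widths, kernels, strides, and activations follow the paper's
# appendix; padding and decoder layer types are assumptions.
import torch
import torch.nn as nn

LATENT_DIM = 8  # L = 8, the latent dimension tuned in the paper


def conv_block(in_ch, out_ch, k, s, last=False):
    """Conv -> BatchNorm -> ELU; the last (linear) layer skips BN and ELU."""
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=k, stride=s)]
    if not last:
        layers += [nn.BatchNorm2d(out_ch), nn.ELU()]
    return layers


# Encoder: kernels [5, 4, 3, 5], strides [1, 2, 2, 1],
# feature maps [32, 32, 64, 2*8] (mean and log-variance of the latent).
encoder = nn.Sequential(
    *conv_block(1, 32, 5, 1),
    *conv_block(32, 32, 4, 2),
    *conv_block(32, 64, 3, 2),
    *conv_block(64, 2 * LATENT_DIM, 5, 1, last=True),
)

# Decoder: kernels [4, 5, 5, 3], strides [1, 2, 2, 1],
# feature maps [32, 16, 16, 1]; sigmoid output, BN + ELU elsewhere.
decoder = nn.Sequential(
    nn.ConvTranspose2d(LATENT_DIM, 32, 4, 1),
    nn.BatchNorm2d(32), nn.ELU(),
    nn.ConvTranspose2d(32, 16, 5, 2),
    nn.BatchNorm2d(16), nn.ELU(),
    nn.ConvTranspose2d(16, 16, 5, 2, output_padding=1),
    nn.BatchNorm2d(16), nn.ELU(),
    nn.ConvTranspose2d(16, 1, 3, 1),
    nn.Sigmoid(),
)

# A 28x28 MNIST batch maps to a 16-channel 1x1 code (mean + log-variance),
# and the 8 mean channels decode back to a 28x28 reconstruction.
```

With these (assumed) padding choices the spatial sizes work out exactly: 28 → 24 → 11 → 5 → 1 through the encoder, and 1 → 4 → 11 → 26 → 28 through the decoder; any other padding scheme that preserves the quoted kernels and strides would need different adjustments.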