Neural Architecture Design and Robustness: A Dataset
Authors: Steffen Jung, Jovita Lukasik, Margret Keuper
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate all these networks on a range of common adversarial attacks and corruption types and introduce a database on neural architecture design and robustness evaluations. We further present three exemplary use cases of this dataset, in which we (i) benchmark robustness measurements based on Jacobian and Hessian matrices for their robustness predictability, (ii) perform neural architecture search on robust accuracies, and (iii) provide an initial analysis of how architectural design choices affect robustness. |
| Researcher Affiliation | Academia | 1 Max Planck Institute for Informatics, Saarland Informatics Campus, {steffen.jung,jlukasik,keuper}@mpi-inf.mpg.de; 2 University of Siegen |
| Pseudocode | Yes | Algorithm 1: Robustness Dataset Gathering |
| Open Source Code | Yes | Code and data is available at http://robustness.vision/. |
| Open Datasets | Yes | Each architecture is trained on three different image datasets for 200 epochs: CIFAR-10 (Krizhevsky, 2009), CIFAR-100 (Krizhevsky, 2009) and ImageNet16-120 (Chrabaszcz et al., 2017). |
| Dataset Splits | No | The paper explicitly mentions using 'test splits' but does not specify details about validation splits for reproducing experiments. |
| Hardware Specification | Yes | These clusters are comprised of either (i) compute nodes with Nvidia A100 GPUs, 512 GB RAM, and Intel Xeon Ice Lake-SP processors, (ii) compute nodes with NVIDIA Quadro RTX 8000 GPUs, 1024 GB RAM, and AMD EPYC 7502P processors, (iii) NVIDIA Tesla A100 GPUs, 2048 GB RAM, Intel Xeon Platinum 8360Y processors, and (iv) NVIDIA Tesla A40 GPUs, 2048 GB RAM, Intel Xeon Platinum 8360Y processors. |
| Software Dependencies | No | The paper mentions tools such as Foolbox and methods such as that of Chatzimichailidis et al. (2019), but does not provide version numbers for these software components or other dependencies. |
| Experiment Setup | Yes | In the case of architectures trained for NAS-Bench-201, this is cross entropy (CE). Since attacks via FGSM can be evaluated fairly efficiently, we evaluate all architectures for ϵ ∈ E_FGSM = {0.1, 0.5, 1, 2, ..., 8, 255}/255, so for a total of |E_FGSM| = 11 times for each architecture. We use Foolbox (Rauber et al., 2017) to perform the attacks, and collect (a) accuracy, (b) average prediction confidences, as well as (c) confusion matrices for each network and ϵ combination. ... Therefore, we find it sufficient to evaluate PGD for ϵ ∈ E_PGD = {0.1, 0.5, 1, 2, 3, 4, 8}/255, so for a total of |E_PGD| = 7 times for each architecture. As for FGSM, we use Foolbox (Rauber et al., 2017) to perform the attacks using their L∞ PGD implementation and keep the default settings, which are α = 0.01/0.3 for 40 attack iterations. ... We kept the default number of attack iterations that is 100. ... We kept the default number of search iterations at 5 000. (A Foolbox-based sketch of this evaluation setup follows the table.) |
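
The quoted attack setup maps fairly directly onto the Foolbox API. Below is a minimal sketch of that evaluation, assuming Foolbox 3.x, a PyTorch classifier `model`, and inputs already scaled to [0, 1]; these assumptions, the function name `robust_accuracy`, and the single-batch structure are illustrative only and are not taken from the paper's released code at http://robustness.vision/.

```python
# Minimal sketch of the FGSM / PGD robust-accuracy evaluation quoted above.
# Assumptions (not from the paper's code): Foolbox 3.x, a PyTorch classifier
# `model`, and input images already scaled to [0, 1].
import foolbox as fb


def robust_accuracy(model, images, labels):
    """Return per-epsilon robust accuracy for FGSM and PGD on one batch."""
    fmodel = fb.PyTorchModel(model.eval(), bounds=(0, 1))

    # Epsilon grids from the quoted setup: |E_FGSM| = 11, |E_PGD| = 7.
    eps_fgsm = [e / 255 for e in (0.1, 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 255)]
    eps_pgd = [e / 255 for e in (0.1, 0.5, 1, 2, 3, 4, 8)]

    attacks = {
        "fgsm": (fb.attacks.FGSM(), eps_fgsm),
        # Foolbox's LinfPGD defaults correspond to the quoted settings:
        # relative step size 0.01 / 0.3 and 40 attack iterations.
        "pgd": (fb.attacks.LinfPGD(), eps_pgd),
    }

    results = {}
    for name, (attack, epsilons) in attacks.items():
        # `success` has shape (len(epsilons), batch_size); True entries mean
        # the attack changed the prediction at that epsilon.
        _, _, success = attack(fmodel, images, labels, epsilons=epsilons)
        results[name] = 1.0 - success.float().mean(dim=-1)
    return results
```

In the paper's pipeline, such an evaluation would be run per architecture and accumulated over the full test split, additionally recording prediction confidences and confusion matrices; the sketch above only reports robust accuracy per epsilon.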