Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck
Authors: Junho Kim, Byung-Kwan Lee, Yong Man Ro
Venue: NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through comprehensive experiments, we demonstrate that the distilled features are highly correlated with adversarial prediction, and they have human-perceptible semantic information by themselves. |
| Researcher Affiliation | Academia | School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST) {arkimjh, leebk, ymro}@kaist.ac.kr |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. Figure 1 presents a diagram of the bottleneck concept, but it is not pseudocode. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | Table 1 caption: "Classification accuracy of model performance attacked by FGSM [2], PGD [7], and CW [4] on VGG-16 [29] and WRN-28-10 [30], adversarially trained with γ = 0.03 for CIFAR-10, SVHN, and Tiny-ImageNet." The paper also states that it uses "publicly available datasets [32, 33, 34]". |
| Dataset Splits | No | The paper uses standard datasets (CIFAR-10, SVHN, and Tiny-ImageNet) but does not explicitly state the training/validation/test splits (e.g., percentages or sample counts) needed for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or specific computing environments used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies or version numbers (e.g., programming language versions, library versions) required to replicate the experiments. |
| Experiment Setup | Yes | "In this paper, we adversarially train the model f on γ = 0.03 for the standard adversarial attack." (An illustrative training sketch follows this table.) |
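
Since the paper releases no code, the following is a minimal, hypothetical PyTorch sketch of what "adversarially train the model f on γ = 0.03" typically entails, assuming γ denotes the L∞ perturbation budget and that standard PGD-based adversarial training (Madry et al.) is used. The step size, iteration count, and loop structure are illustrative assumptions, not the authors' settings.

```python
# Hypothetical sketch of standard PGD adversarial training; not the authors' released code.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=0.03, alpha=2 / 255, steps=10):
    """Craft L-inf bounded adversarial examples with projected gradient descent.
    eps mirrors the paper's gamma = 0.03 (assumed to be the L-inf budget)."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the eps-ball around x.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def adversarial_training_epoch(model, loader, optimizer, device, eps=0.03):
    """One epoch of adversarial training: optimize the model on PGD examples."""
    model.train()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(model, x, y, eps=eps)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
```

This sketch only illustrates the standard setup referenced in the Experiment Setup row; the paper's actual optimizer, schedule, and attack hyperparameters beyond γ = 0.03 are not specified in the text quoted above.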