Boundary thickness and robustness in learning models
Authors: Yaoqing Yang, Rajiv Khanna, Yaodong Yu, Amir Gholami, Kurt Keutzer, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that a thicker boundary helps improve robustness against adversarial examples... and we show that many commonly-used regularization and data augmentation procedures can increase boundary thickness. We demonstrate empirically that a thin decision boundary leads to poor adversarial robustness as well as poor OOD robustness (Section 3), and we evaluate the effect of model adjustments that affect boundary thickness. (A sketch of how boundary thickness is measured appears after the table.) |
| Researcher Affiliation | Academia | Yaoqing Yang, Rajiv Khanna, Yaodong Yu, Amir Gholami, Kurt Keutzer, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney; University of California, Berkeley, Berkeley, CA 94720; {yqyang, rajivak, yaodong_yu, amirgh, keutzer, jegonzal, kannanr, mahoneymw}@berkeley.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | In order that our results can be reproduced and extended, we have open-sourced our code: https://github.com/nsfzyzz/boundary_thickness |
| Open Datasets | Yes | Here, we compare the boundary thicknesses and robustness of models trained with three different schemes on CIFAR10 [34] |
| Dataset Splits | No | The paper does not provide the dataset split information (exact percentages, sample counts, citations to predefined splits, or splitting methodology) needed to reproduce the data partitioning. It uses CIFAR10/CIFAR100, which have standard train/test splits, but the splits are never stated explicitly in the text. (A loading sketch for the standard split appears after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper names the neural network architectures it uses (e.g., ResNets, VGGs) but does not list ancillary software with version numbers (e.g., Python, PyTorch, or TensorFlow versions). |
| Experiment Setup | Yes | All models are trained with the same initial learning rate of 0.1. At both epoch 100 and 150, we reduce the current learning rate by a factor of 10. In the standard setting, we follow convention and train with learning rate 0.1, weight decay 5e-4, attack range = 8 pixels, 10 iterations for each attack, and 2 pixels for the step-size. We train each model for enough time (400 epochs). (A training-setup sketch appears after the table.) |
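
Since boundary thickness is the central quantity in the Research Type row, a brief sketch may help. Per the paper's definition (as we recall it; the excerpt above does not state it), the thickness of the segment between two points x_r and x_s is the segment length multiplied by the fraction of the segment on which the posterior gap g(x) = f_i(x) - f_j(x) lies in an interval (alpha, beta). The sketch below is a reconstruction under assumptions: the defaults alpha = 0 and beta = 0.75, the uniform sampling of the segment, and the function name are ours, not quoted from the paper.

```python
import torch
import torch.nn.functional as F

def boundary_thickness(model, x_r, x_s, i, j, alpha=0.0, beta=0.75, num_points=128):
    """Sketch of boundary thickness for one pair (x_r, x_s).

    Thickness = ||x_r - x_s|| * (fraction of the segment between x_r and x_s
    on which the posterior gap f_i - f_j lies in (alpha, beta)).
    alpha=0 and beta=0.75 are assumed defaults, not taken from the excerpt.
    """
    model.eval()
    # Points x(t) = (1 - t) * x_r + t * x_s along the segment, t in [0, 1].
    t = torch.linspace(0.0, 1.0, num_points, device=x_r.device).view(-1, 1, 1, 1)
    segment = (1.0 - t) * x_r.unsqueeze(0) + t * x_s.unsqueeze(0)
    with torch.no_grad():
        probs = F.softmax(model(segment), dim=1)
    gap = probs[:, i] - probs[:, j]                     # posterior gap g(x(t))
    inside = ((gap > alpha) & (gap < beta)).float().mean()
    return (x_r - x_s).norm() * inside
```

In the paper's experiments, x_s is typically taken to be an adversarial perturbation of x_r, so the segment crosses the decision boundary between classes i and j.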
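On the Dataset Splits row: CIFAR10 ships with a fixed 50,000-image training set and 10,000-image test set, so the standard partition is the most likely reading. A minimal torchvision loading sketch, assuming (the paper does not confirm this) the standard loaders and a plain ToTensor transform:

```python
import torchvision
import torchvision.transforms as T

# Standard CIFAR10 partition: 50,000 training / 10,000 test images.
# The transform choice is an assumption; the paper does not specify it.
transform = T.ToTensor()  # scales pixel values to [0, 1]

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(
    root="./data", train=False, download=True, transform=transform)

print(len(train_set), len(test_set))  # 50000 10000
```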
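The Experiment Setup row maps directly onto a conventional PyTorch configuration. The sketch below is illustrative, not the authors' code: the optimizer (SGD) and momentum (0.9) are assumptions, as is the convention that pixel values lie in [0, 1], under which "8 pixels" becomes epsilon = 8/255 and a "2 pixel" step becomes 2/255.

```python
import torch
import torch.nn.functional as F
import torchvision

# Architecture choice here (ResNet-18) is illustrative; the paper
# evaluates several ResNets and VGGs.
model = torchvision.models.resnet18(num_classes=10)

# Quoted schedule: initial lr 0.1, weight decay 5e-4, 10x drops at
# epochs 100 and 150. Momentum 0.9 is an assumption, not stated above.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 150], gamma=0.1)

def pgd_attack(model, x, y, eps=8/255, step=2/255, iters=10):
    """10-step L-inf PGD matching the quoted attack budget (assumes [0,1] pixels)."""
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + step * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project onto the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)             # keep pixels valid
    return x_adv.detach()
```

In a training loop, scheduler.step() would be called once per epoch to realize the quoted 10x learning-rate drops at epochs 100 and 150.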