Enhancing Adversarial Defense by k-Winners-Take-All
Authors: Chang Xiao, Peilin Zhong, Changxi Zheng
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test k-WTA activation on various network structures optimized by a training method, be it adversarial training or not. In all cases, the robustness of k-WTA networks outperforms that of traditional networks under white-box attacks. We conducted extensive experiments on multiple datasets under different network architectures, including ResNet (He et al., 2016), DenseNet (Huang et al., 2017), and Wide ResNet (Zagoruyko & Komodakis, 2016), that are optimized by regular training as well as various adversarial training methods (Madry et al., 2017; Zhang et al., 2019; Shafahi et al., 2019b). |
| Researcher Affiliation | Academia | Chang Xiao, Peilin Zhong, Changxi Zheng, Columbia University, {chang, peilin, cxz}@cs.columbia.edu |
| Pseudocode | No | The paper describes the k-WTA activation function and its training procedure in narrative text and mathematical formulas (Equation 1), but it does not include any structured pseudocode or algorithm blocks (a minimal sketch of the activation is given below the table). |
| Open Source Code | Yes | To promote reproducible research, we will release our implementation of k-WTA networks, along with our experiment code, configuration files and pre-trained models (https://github.com/a554b554/kWTA-Activation). |
| Open Datasets | Yes | We conducted extensive experiments on multiple datasets under different network architectures, including ResNet (He et al., 2016), DenseNet (Huang et al., 2017), and Wide ResNet (Zagoruyko & Komodakis, 2016), that are optimized by regular training as well as various adversarial training methods (Madry et al., 2017; Zhang et al., 2019; Shafahi et al., 2019b). In each setup, we compare the robust accuracy of k-WTA networks with standard ReLU networks on three datasets, CIFAR-10, SVHN, and MNIST. |
| Dataset Splits | No | The paper uses CIFAR-10 and SVHN datasets which have standard train/test splits, but it does not explicitly provide specific details for a separate validation split, such as percentages, sample counts, or a citation for a predefined validation split. |
| Hardware Specification | No | The paper states "All experiments are conducted using PyTorch framework." but does not provide any specific hardware details such as GPU or CPU models, memory, or specific computing environments used for the experiments. |
| Software Dependencies | No | The paper mentions "PyTorch framework" and "Foolbox (Rauber et al., 2017)" but does not specify version numbers for these software dependencies. |
| Experiment Setup | Yes | All the ReLU networks are trained with the stochastic gradient descent (SGD) method with momentum 0.9. We use a learning rate of 0.1 from the first to the 50th epoch and 0.01 from the 50th to the 80th epoch. All networks are trained with a batch size of 256. For the PGD attack, we use 40 iterations with random start; the step size is 0.003 (these settings are wired into a configuration sketch below the table). |
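
The Pseudocode row notes that k-WTA is specified only as a formula (Equation 1): the activation keeps the k largest entries of its input and zeroes out the rest. The following is a minimal PyTorch sketch of that idea, assuming the top-k is taken over all units of a layer for each sample and that k is set by a sparsity ratio; the class name `KWTA` and the `sparsity` argument are illustrative, not taken from the released code.

```python
import torch
import torch.nn as nn

class KWTA(nn.Module):
    """k-Winners-Take-All: keep the k largest activations per sample, zero the rest."""

    def __init__(self, sparsity=0.1):
        super().__init__()
        self.sparsity = sparsity  # fraction of units kept (illustrative default)

    def forward(self, x):
        # Flatten all non-batch dimensions so the top-k is computed per sample.
        flat = x.reshape(x.size(0), -1)
        k = max(1, int(self.sparsity * flat.size(1)))
        # The k-th largest value of each sample is the keep/zero threshold.
        threshold = flat.topk(k, dim=1).values[:, -1]
        threshold = threshold.view(-1, *([1] * (x.dim() - 1)))
        return x * (x >= threshold).to(x.dtype)
```

The paper's approach amounts to swapping ReLU activations for a layer of this kind; the exact sparsity ratio and placement per architecture should be taken from the released repository rather than this sketch.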
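
The Experiment Setup row gives the optimizer and attack hyperparameters but not the surrounding code. The sketch below wires the quoted numbers (SGD with momentum 0.9, learning rate 0.1 dropping to 0.01 after epoch 50, batch size 256, 40-step PGD with random start and step size 0.003) into a plausible PyTorch configuration; `model`, `x`, `y`, and `epsilon` are placeholders, since the perturbation bound is not stated in the excerpt.

```python
import torch
import torch.nn as nn

def make_optimizer_and_scheduler(model):
    # SGD with momentum 0.9; lr = 0.1 for epochs 1-50, then 0.01 for epochs 50-80.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[50], gamma=0.1)
    return optimizer, scheduler

def pgd_attack(model, x, y, epsilon, step_size=0.003, steps=40):
    """L-infinity PGD with random start, matching the quoted settings."""
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon)  # random start
    delta.requires_grad_(True)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        loss = loss_fn(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += step_size * grad.sign()          # signed gradient ascent step
            delta.clamp_(-epsilon, epsilon)           # project back into the L-inf ball
            delta.copy_((x + delta).clamp(0, 1) - x)  # keep the perturbed image in [0, 1]
    return (x + delta).detach()
```

The batch size of 256 would be set on the DataLoader; the Foolbox-based attacks mentioned in the Software Dependencies row are not covered by this sketch.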