Curriculum Adversarial Training
Authors: Qi-Zhi Cai, Chang Liu, Dawn Song
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on CIFAR-10 and SVHN [Netzer et al., 2011], and compare our approach against the state-of-the-art approach from [Madry et al., 2018]. We observe that our approach can consistently improve the empirical worst-case accuracy: from 46.18% to 69.27% on CIFAR-10, and from 40.38% to 75.66% on SVHN. Also, on non-adversarial test data, models trained using our approach decrease the accuracy of the state-of-the-art models by at most 6%. Therefore, CAT has potential to be deployed in practice to achieve a robust model. |
| Researcher Affiliation | Academia | 1 Nanjing University 2 UC Berkeley |
| Pseudocode | Yes | Algorithm 1 Adversarial Training (AT(D, N, η, G)). Input: training data D; total iterations N; learning rate η; an attack G. Output: θ. 1: Randomly initialize network θ; 2: for i ← 0 to N do; 3: Sample a batch (xᵢ, yᵢ) ∼ D; 4: Generate adversarial examples x′ᵢ ← G(xᵢ, yᵢ); 5: θ ← θ − η Σᵢ ∇θ L(fθ(x′ᵢ), yᵢ) // This is standard SGD; it can be replaced by other training algorithms such as Adam; 6: end for |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | We evaluate our approach on CIFAR-10 and SVHN [Netzer et al., 2011] |
| Dataset Splits | Yes | Input: training data D; validation data V; epoch iterations n; learning rate η; maximal attack strength K; a class of attacks, denoted A(k), whose strength is parameterized by k. Output: θ. 1: Randomly initialize network θ; 2: for l ← 0 to K do; 3: repeat; 4: θ ← AT(D, n, η, A(l)) // one epoch of adversarial training using A(l); 5: until l-accuracy on V has not increased for 10 epochs; 6: end for |
| Hardware Specification | No | The paper mentions neural network architectures (ResNet-50, DenseNet-161) but does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific versions for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | We set the hyper-parameters for different data sets to be the same as used in the literature: CIFAR-10: bound 8/255, K = 7 [Madry et al., 2018]; SVHN: bound 12/255, K = 10 [Buckman et al., 2018]. For both datasets, we use two state-of-the-art image classification architectures: ResNet-50 [He et al., 2016] and DenseNet-161 [Huang et al., 2017]. Mini-batch size is set to be 200 for all our approaches. |
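The two algorithm listings quoted in the table (AT, and the curriculum loop that raises attack strength l from 0 to K with early stopping on validation l-accuracy) can be sketched end-to-end. The following is a minimal NumPy illustration, not the paper's implementation: it trains a toy logistic-regression model on synthetic blobs, with a single-step FGSM-style L∞ attack family standing in for the paper's PGD attacks; the data, the function names, and the hyper-parameters (`eps_step`, `patience`) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Two well-separated Gaussian blobs with labels in {-1, +1}.
    y = rng.choice([-1, 1], size=n)
    X = y[:, None] * 2.0 + rng.normal(size=(n, 2))
    return X, y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(w, X, y, eps):
    # A(k)-style attack family: L_inf step of size eps along the sign
    # of the input gradient of the logistic loss.
    margin = y * (X @ w)
    grad_x = (-y * sigmoid(-margin))[:, None] * w[None, :]
    return X + eps * np.sign(grad_x)

def at(w, X, y, epochs, eta, attack):
    # Algorithm 1 (AT): SGD on adversarial examples generated per sample.
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            xa = attack(w, X[i:i + 1], y[i:i + 1])[0]
            m = y[i] * (xa @ w)
            w = w + eta * y[i] * sigmoid(-m) * xa  # descent step on log loss
    return w

def robust_acc(w, X, y, attack):
    # Accuracy under the given attack ("l-accuracy" when attack = A(l)).
    Xa = attack(w, X, y)
    return np.mean(np.sign(Xa @ w) == y)

def cat(X, y, Xv, yv, K, eps_step=0.05, eta=0.1, patience=3):
    # Curriculum loop: for l = 0..K, repeat one epoch of AT with attack
    # A(l) until l-accuracy on the validation set stops improving.
    w = rng.normal(size=X.shape[1]) * 0.01
    for l in range(K + 1):
        attack = lambda w_, X_, y_: fgsm(w_, X_, y_, l * eps_step)
        best, stale = -1.0, 0
        while stale < patience:
            w = at(w, X, y, 1, eta, attack)
            acc = robust_acc(w, Xv, yv, attack)
            best, stale = (acc, 0) if acc > best else (best, stale + 1)
    return w

X, y = make_data(400)
Xv, yv = make_data(200)
w = cat(X, y, Xv, yv, K=4)
```

The curriculum structure is the point of the sketch: each strength level l is trained to (validation) convergence before l is incremented, so the model never faces an attack much stronger than the one it was just trained against.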