Self-Progressing Robust Training
Authors: Minhao Cheng, Pin-Yu Chen, Sijia Liu, Shiyu Chang, Cho-Jui Hsieh, Payel Das (pp. 7107-7115)
AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Compared with state-of-the-art adversarial training methods (PGD-ℓ∞ and TRADES) under ℓ∞-norm bounded attacks and various invariance tests, SPROUT consistently attains superior performance and is more scalable to large neural networks. We evaluate the multi-dimensional performance of different training methods on (wide) ResNet and VGG networks using CIFAR-10 and ImageNet datasets. |
| Researcher Affiliation | Collaboration | Minhao Cheng (1,2), Pin-Yu Chen (2), Sijia Liu (3), Shiyu Chang (2), Cho-Jui Hsieh (1), Payel Das (2). (1) Department of Computer Science, UCLA; (2) IBM Research; (3) Department of Computer Science and Engineering, Michigan State University |
| Pseudocode | Yes | Algorithm 1 (SPROUT). Input: training dataset (X, Y), Mixup parameter λ, Gaussian augmentation variance σ², model learning rate γ_θ, Dirichlet label-smoothing learning rate γ_β and parameter α, generalized cross-entropy loss L. Initialize model θ (random initialization to train from scratch, or a pre-trained model checkpoint) and β (random initialization). For epoch = 1, ..., N: for each minibatch X_B ⊂ X, Y_B ⊂ Y: X_B ~ N(X_B, σ²); (X_mix, Y_mix) ← Mixup(X_B, Y_B, λ); Y_mix ~ Dirichlet(α·Y_mix + (1 - α)·β); g_θ ← ∇_θ L(X_mix, Y_mix, θ); g_β ← ∇_β L(X_mix, Y_mix, θ); θ ← θ - γ_θ·g_θ; β ← β + γ_β·g_β. Return θ. (A hedged PyTorch-style sketch of this loop follows the table.) |
| Open Source Code | Yes | Our implementation is publicly available. Code available at https://github.com/IBM/SPROUT |
| Open Datasets | Yes | We use CIFAR-10 and ImageNet (Deng et al. 2009) for performance evaluation. |
| Dataset Splits | No | The paper mentions training and test datasets but does not explicitly describe a separate validation set or specific train/validation/test splits (by percentage or count) needed for reproduction. |
| Hardware Specification | No | The paper mentions "On our machine" and discusses "computation resources" and "run-time" (Table 6), but it does not provide any specific hardware details such as CPU/GPU models, memory, or detailed cloud instance specifications. |
| Software Dependencies | No | The paper mentions using "Pytorch implementation" but does not specify the version of PyTorch or any other software dependencies (e.g., Python, CUDA, other libraries) with their version numbers. |
| Experiment Setup | Yes | As suggested in Mixup (Zhang et al. 2018), we set the Beta distribution parameter a = 0.2 when sampling the mixing parameter λ. For Gaussian augmentation, we set σ = 0.1, which is within the suggested range in (Zantedeschi, Nicolae, and Rawat 2017). Also, we set the label smoothing parameter α = 0.01. A parameter sensitivity analysis on λ and α is given in the Appendix. Unless specified otherwise, for SPROUT we set the model initialization to be a natural model. (These values are used in the example invocation after the table.) |
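
The pseudocode row above is easier to follow as runnable code. Below is a minimal PyTorch-style sketch of the SPROUT training loop, assuming a soft-label form of the generalized cross-entropy loss and a trainable per-class Dirichlet concentration vector β. The helper name `gce_loss`, the value q = 0.7, the SGD optimizer, and the clamping of β are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
# A hedged sketch of Algorithm 1 (SPROUT); names and defaults are assumptions.
import torch
import torch.nn.functional as F
from torch.distributions import Beta, Dirichlet


def gce_loss(logits, soft_targets, q=0.7):
    """Assumed soft-label form of generalized cross-entropy:
    mean over the batch of sum_c y_c * (1 - p_c^q) / q."""
    probs = F.softmax(logits, dim=1)
    return (soft_targets * (1.0 - probs.pow(q)) / q).sum(dim=1).mean()


def sprout_train(model, loader, num_classes, epochs=200,
                 mixup_a=0.2,    # Beta(a, a) for the Mixup coefficient
                 sigma=0.1,      # std of the Gaussian augmentation
                 alpha=0.01,     # Dirichlet label-smoothing weight
                 lr_theta=0.1, lr_beta=0.1, device="cpu"):
    model.to(device)
    opt = torch.optim.SGD(model.parameters(), lr=lr_theta, momentum=0.9)
    # beta: trainable Dirichlet concentration over classes (random init)
    beta = torch.rand(num_classes, device=device, requires_grad=True)

    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            y_onehot = F.one_hot(y, num_classes).float()

            # Gaussian augmentation: x ~ N(x, sigma^2)
            x = x + sigma * torch.randn_like(x)

            # Mixup with a single lambda ~ Beta(a, a) per minibatch
            lam = Beta(mixup_a, mixup_a).sample().to(device)
            perm = torch.randperm(x.size(0), device=device)
            x_mix = lam * x + (1 - lam) * x[perm]
            y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]

            # Dirichlet label smoothing: y ~ Dir(alpha * y_mix + (1 - alpha) * beta)
            conc = alpha * y_mix + (1 - alpha) * beta.clamp(min=1e-3)
            y_smooth = Dirichlet(conc).rsample()  # reparameterized, so beta receives a gradient

            loss = gce_loss(model(x_mix), y_smooth)

            # theta: gradient descent; beta: gradient ascent on the same loss
            opt.zero_grad()
            beta.grad = None
            loss.backward()
            opt.step()
            with torch.no_grad():
                beta += lr_beta * beta.grad
    return model
```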
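
For concreteness, here is a hypothetical invocation of the sketch above with the settings reported in the experiment-setup row (Beta parameter a = 0.2, σ = 0.1, α = 0.01). The ResNet-18 and CIFAR-10 loader are stand-ins for the paper's (wide) ResNet/VGG setup, not the exact configuration.

```python
# Hypothetical usage; model and loader are stand-ins, not the paper's exact setup.
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

train_set = datasets.CIFAR10("./data", train=True, download=True,
                             transform=transforms.ToTensor())
loader = DataLoader(train_set, batch_size=128, shuffle=True)
model = models.resnet18(num_classes=10)  # assumed stand-in architecture

model = sprout_train(model, loader, num_classes=10,
                     mixup_a=0.2, sigma=0.1, alpha=0.01)
```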