Fast AdvProp
Authors: Jieru Mei, Yucheng Han, Yutong Bai, Yixiao Zhang, Yingwei Li, Xianhang Li, Alan Yuille, Cihang Xie
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical results show that, compared to the vanilla training baseline, Fast AdvProp is able to further improve model performance on a spectrum of visual benchmarks, without incurring extra training cost. Additionally, our ablations find Fast AdvProp scales better if larger models are used, is compatible with existing data augmentation methods (i.e., Mixup and CutMix), and can be easily adapted to other recognition tasks like object detection. |
| Researcher Affiliation | Academia | Jieru Mei1, Yucheng Han2, Yutong Bai1, Yixiao Zhang1, Yingwei Li1, Xianhang Li3, Alan Yuille1 & Cihang Xie3 — 1Johns Hopkins University, 2Nanyang Technological University, 3UC Santa Cruz |
| Pseudocode | Yes | Algorithm 1: Pseudo code of Fast AdvProp for T epochs, given some radius ϵ, importance re-weight parameter β, learning rate γ, and ratio of adversarial examples padv. |
| Open Source Code | Yes | The code is available here: https://github.com/meijieru/fast_advprop. |
| Open Datasets | Yes | We evaluate model performance on ImageNet classification, and the robustness on different specialized benchmarks including ImageNet-C, ImageNet-R, and Stylized-ImageNet. The ImageNet dataset contains 1.2 million training images and 50000 validation images of 1000 classes. We implement Fast AdvProp on object detection and evaluate it on the COCO dataset (Lin et al., 2014). |
| Dataset Splits | Yes | The ImageNet dataset contains 1.2 million training images and 50000 validation images of 1000 classes. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'a SGD optimizer' and refers to 'ResNet family' architectures, but does not specify any software dependencies (e.g., PyTorch, TensorFlow, or specific library versions) with version numbers. |
| Experiment Setup | Yes | We use the renowned ResNet family (He et al., 2016) as our default architectures. We use a SGD optimizer with momentum 0.9 and train for 105 epochs. The learning rate starts from 0.1 and decays at 30, 60, 90, 100 epochs by 0.1. We use a batch size of 64 per GPU for vanilla training. For the decoupled training setting, we use a batch size of 64/(1 − padv) per GPU, keeping the same 64 batch size per GPU for the original BNs. padv is set to 0.2 if not specified. ... To generate adversarial images, we use the PGD attacker with random initialization. We attack for one step (K = 1) and set the perturbation size to 1.0. ... we set β = 0.5 for halving the importance of the images with random noise and the adversarial images. Additionally, we re-scale the gradient to achieve the 1 : 1 : 1 ratio for ensuring similar updating speed of all parameters. |
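The adversarial-example generation described in the setup row — a one-step (K = 1) PGD attack with random initialization and perturbation size 1.0 — can be sketched as below. This is a minimal NumPy illustration under stated assumptions: the function name, the toy quadratic loss, and the gradient interface are hypothetical, not from the paper's released code.

```python
import numpy as np

def pgd_one_step(x, grad_fn, epsilon=1.0, rng=None):
    """One-step PGD with random initialization, as in the setup above.

    x       : clean input (NumPy array)
    grad_fn : callable returning the gradient of the loss w.r.t. the input
    epsilon : perturbation size (the paper uses 1.0)
    """
    rng = np.random.default_rng() if rng is None else rng
    # Random start uniformly inside the epsilon-ball around x.
    delta = rng.uniform(-epsilon, epsilon, size=x.shape)
    # Single ascent step in the sign of the input gradient (FGSM-style).
    delta = delta + epsilon * np.sign(grad_fn(x + delta))
    # Project the perturbation back onto the epsilon-ball.
    delta = np.clip(delta, -epsilon, epsilon)
    return x + delta

# Toy usage with loss = 0.5 * ||x||^2, whose input gradient is x itself.
x = np.array([0.5, -0.3])
x_adv = pgd_one_step(x, grad_fn=lambda v: v, epsilon=1.0)
```

Because the perturbation is clipped to the epsilon-ball before being applied, `x_adv` always stays within L∞ distance `epsilon` of the clean input; with `padv = 0.2`, such examples would make up one fifth of each decoupled training batch.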