Why Do Artificially Generated Data Help Adversarial Robustness
Authors: Yue Xing, Qifan Song, Guang Cheng
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical studies are conducted to verify our theories and show the effectiveness of the proposed algorithm. ... In Appendix B, we aim to use simple models to numerically verify: (1) given the ideal data generator, the performance of θ̃(ε) is better than θ̂(ε) when ε deviates from zero; (2) a better-quality data generator implies better performance of θ̃(ε); and (3) balancing the weight between S1 and S2 improves the performance. We observe all of (1) to (3) in the simulations. In Appendix C, we aim to verify that the label cost and the generator cost are important factors in deep learning. We aim to show (1) adding more unlabeled samples from the ideal generator will improve adversarial robustness, and (2) adding unlabeled samples from a poor generator with a small n2 will slightly improve the performance. We perform an experiment on the CIFAR-10 data set. ... In the experiment, we consider binary classification for the CIFAR-10 dataset to classify airplane and car. ... We repeat the experiment 10 times to get the average robust testing accuracy and its standard error. The results are summarized in Table 1. (A hedged sketch of the weighted S1/S2 objective appears below the table.) |
| Researcher Affiliation | Academia | Yue Xing Department of Statistics Purdue University xing49@purdue.edu Qifan Song Department of Statistics Purdue University qfsong@purdue.edu Guang Cheng Department of Statistics University of California, Los Angeles guangcheng@ucla.edu |
| Pseudocode | Yes | Algorithm 1: Select w during Training |
| Open Source Code | Yes | We are using the code of Rice et al. (2020) to do experiments, and we also provide the code for the CIFAR-10 experiment with data from Gowal et al. (2021). |
| Open Datasets | Yes | We perform an experiment on the CIFAR-10 data set. |
| Dataset Splits | No | The paper mentions taking '500 samples from each class as labeled data, i.e., n1 = 1,000' and mentions using 'unlabeled data', but it does not explicitly specify training, validation, and test splits by percentage or sample count, nor does it refer to standard predefined splits for the full dataset used. |
| Hardware Specification | No | The paper states 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] In Section C.' However, the provided text does not include Section C, so no specific hardware details appear in the given content. |
| Software Dependencies | No | The paper states 'The implementation for all real-data experiments is modified from Rice et al. (2020),' but it does not specify any software libraries or dependencies with version numbers. |
| Experiment Setup | Yes | In the experiment, we consider binary classification for the CIFAR-10 dataset... We take 500 samples from each class as labeled data, i.e., n1 = 1,000. To form S2, we sample n2/2 data from the other 9,000 airplane and car pictures, and n2/2 data from other classes... To control the over-fitting problem, we tune w during the first 20% of iterations in the experiment. The experiment setups are postponed to Appendix C. (A hedged data-construction sketch follows the table.) |
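
The Experiment Setup row describes how the labeled set S1 and the unlabeled set S2 are built from CIFAR-10. Below is a minimal sketch of that construction, assuming the standard torchvision CIFAR-10 loader and class indices 0 (airplane) and 1 (automobile, the paper's "car"); the variable names and the value of n2 are illustrative and are not taken from the authors' code, which builds on Rice et al. (2020).

```python
# Hypothetical sketch of the S1/S2 data construction described in the
# Experiment Setup row; class indices, names, and n2 are assumptions.
import numpy as np
from torchvision.datasets import CIFAR10

rng = np.random.default_rng(0)

train = CIFAR10(root="./data", train=True, download=True)
labels = np.array(train.targets)

AIRPLANE, AUTOMOBILE = 0, 1   # CIFAR-10 indices for "airplane" and "automobile"
n2 = 10_000                   # illustrative value; the paper varies n2

# S1: 500 labeled images per class, i.e., n1 = 1,000.
idx_air = rng.permutation(np.where(labels == AIRPLANE)[0])
idx_car = rng.permutation(np.where(labels == AUTOMOBILE)[0])
s1_idx = np.concatenate([idx_air[:500], idx_car[:500]])

# S2 (treated as unlabeled): n2/2 from the remaining 9,000 airplane/car images,
# plus n2/2 from the other eight classes.
rest_air_car = np.concatenate([idx_air[500:], idx_car[500:]])
other = np.where((labels != AIRPLANE) & (labels != AUTOMOBILE))[0]
s2_idx = np.concatenate([
    rng.choice(rest_air_car, size=n2 // 2, replace=False),
    rng.choice(other, size=n2 // 2, replace=False),
])

print(len(s1_idx), len(s2_idx))  # 1000, n2
```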
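
The Research Type row quotes the finding that balancing the weight between S1 and S2 improves performance, and the Experiment Setup row notes that w is tuned during the first 20% of iterations. The sketch below assumes a convex-combination objective w·L(S1) + (1−w)·L(S2) and a simple grid-based selection rule; both are assumptions for illustration, not the authors' Algorithm 1.

```python
# Hedged sketch of a weighted S1/S2 objective; the combination form and the
# selection rule are assumptions made for illustration only.
import torch.nn.functional as F

def combined_loss(model, x1, y1, x2, y2_pseudo, w):
    """Assumed objective: w * loss on labeled S1 + (1 - w) * loss on pseudo-labeled S2."""
    loss_s1 = F.cross_entropy(model(x1), y1)
    loss_s2 = F.cross_entropy(model(x2), y2_pseudo)
    return w * loss_s1 + (1.0 - w) * loss_s2

def maybe_select_w(step, total_steps, candidate_ws, score_fn, current_w):
    """Re-select w only during the first 20% of iterations, then keep it fixed.

    score_fn(w) stands in for whatever robust-accuracy criterion is used to
    compare candidate weights; its exact form is not specified in the excerpt.
    """
    if step < 0.2 * total_steps:
        return max(candidate_ws, key=score_fn)
    return current_w
```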