Balance, Imbalance, and Rebalance: Understanding Robust Overfitting from a Minimax Game Perspective
Authors: Yifei Wang, Liangchen Li, Jiansheng Yang, Zhouchen Lin, Yisen Wang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate this understanding with extensive experiments, and provide a holistic view of robust overfitting from the dynamics of both game players. This understanding further inspires us to alleviate robust overfitting by rebalancing the two players, either by regularizing the trainer's capacity or by improving the attack strength. Experiments show that the proposed ReBalanced Adversarial Training (ReBAT) can attain good robustness and does not suffer from robust overfitting even after very long training. |
| Researcher Affiliation | Academia | 1 School of Mathematical Sciences, Peking University; 2 Department of Engineering, University of Cambridge; 3 National Key Lab of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University; 4 Institute for Artificial Intelligence, Peking University; 5 Peng Cheng Laboratory |
| Pseudocode | No | The paper describes methods textually and with equations, but does not provide any explicitly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | Yes | Code is available at https://github.com/PKU-ML/ReBAT. |
| Open Datasets | Yes | We consider the classification tasks on CIFAR-10, CIFAR-100 [23], and Tiny-ImageNet [10] with the PreActResNet-18 [17] and WideResNet-34-10 [57] architectures. CIFAR-10 contains 60,000 32×32 RGB images from 10 classes; for each class, there are 5,000 images for training and 1,000 for evaluation. |
| Dataset Splits | Yes | Following Rice et al. [38], we hold out 1,000 images from the original CIFAR-10/100 training sets, and similarly 2,000 images from the original Tiny-ImageNet training set, as validation sets. (A minimal split sketch follows this table.) |
| Hardware Specification | Yes | Each model included in this paper is trained on a single NVIDIA GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions using Python and related libraries but does not specify version numbers for libraries like PyTorch or TensorFlow, nor CUDA versions. |
| Experiment Setup | Yes | We use a PGD-10 attack [30] with step size α = 2/255 and perturbation norm ε = 8/255 to craft adversarial examples on-the-fly. Following the settings in Madry et al. [30], we use an SGD optimizer with momentum 0.9, weight decay 5×10⁻⁴, and batch size 128 to train the model for as many as 1,000 epochs. The learning rate (LR) is initially set to 0.1, decays to 0.01 at epoch 100, and further decays to 0.001 at epoch 150. (A training-setup sketch follows this table.) |
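The validation split described in the Dataset Splits row can be illustrated concretely. Below is a minimal sketch assuming a standard PyTorch/torchvision CIFAR-10 pipeline; the random seed and index choice are our assumptions, since the paper does not specify which 1,000 images are held out.

```python
import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

# Load the full CIFAR-10 training set (50,000 images).
full_train = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())

# Hold out 1,000 images as a validation set, following Rice et al. [38].
# The paper does not say which images are held out, so a seeded random
# permutation is used here purely as an assumption.
perm = torch.randperm(len(full_train),
                      generator=torch.Generator().manual_seed(0))
val_set = Subset(full_train, perm[:1000].tolist())
train_set = Subset(full_train, perm[1000:].tolist())
```

The same pattern applies to CIFAR-100, and to Tiny-ImageNet with 2,000 held-out images.

Likewise, the hyperparameters quoted in the Experiment Setup row can be assembled into a short PyTorch sketch. This is our reconstruction from the quoted numbers, not the authors' released code (see the ReBAT repository for that); `resnet18` stands in for the PreActResNet-18 used in the paper.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L-infinity PGD-10: step size alpha = 2/255, radius eps = 8/255."""
    # Random start inside the eps-ball, clipped to valid pixel range.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Gradient ascent on the loss, then project back into the eps-ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

# Stand-in architecture; the paper uses PreActResNet-18 / WideResNet-34-10.
model = resnet18(num_classes=10)

# SGD with momentum 0.9, weight decay 5e-4; LR 0.1 -> 0.01 (epoch 100)
# -> 0.001 (epoch 150), i.e. a 10x decay at milestones 100 and 150.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 150], gamma=0.1)
```

A training step would then craft `x_adv = pgd_attack(model, x, y)` on-the-fly for each batch of size 128, minimize the cross-entropy on `x_adv`, and call `scheduler.step()` once per epoch.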