Efficient local linearity regularization to overcome catastrophic overfitting
Authors: Elias Abad Rocamora, Fanghui Liu, Grigorios Chrysos, Pablo M. Olmos, Volkan Cevher
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our thorough experimental validation demonstrates that our work does not suffer from catastrophic overfitting (CO), even in challenging settings where previous works suffer from it. We also notice that adapting our regularization parameter during training (ELLE-A) greatly improves the performance, especially in large ϵ setups. |
| Researcher Affiliation | Academia | ¹LIONS, École Polytechnique Fédérale de Lausanne; ²University of Warwick; ³University of Wisconsin-Madison; ⁴Universidad Carlos III de Madrid |
| Pseudocode | Yes | Algorithm 1: ELLE (ELLE-A) adversarial training. Pseudo-code in teal is only run for ELLE-A. (An illustrative sketch of a local-linearity penalty is given after the table.) |
| Open Source Code | Yes | Our implementation is available at https://github.com/LIONS-EPFL/ELLE. |
| Open Datasets | Yes | We train the architectures PreActResNet-18 (PRN), ResNet-50 (He et al., 2016) and WideResNet-28-10 (WRN) (Zagoruyko and Komodakis, 2016) on CIFAR10/100 (Krizhevsky, 2009), SVHN (Netzer et al., 2011) and ImageNet (Deng et al., 2009). |
| Dataset Splits | Yes | We evaluate the PGD-20 adversarial accuracy on a 1024-image validation sample extracted from the training set of each dataset. (A sketch of such a split is given after the table.) |
| Hardware Specification | Yes | ImageNet experiments were conducted on a single machine with an NVIDIA A100 SXM4 80 GB GPU. For the rest of the experiments we used a single machine with an NVIDIA A100 SXM4 40 GB GPU. |
| Software Dependencies | No | The paper does not explicitly list software dependencies with specific version numbers (e.g., Python version, specific deep learning framework version like PyTorch or TensorFlow). |
| Experiment Setup | Yes | We use the SGD optimizer with momentum 0.9 and weight decay 5 × 10⁻⁴. Short: From Andriushchenko and Flammarion (2020), with 30 and 15 epochs for CIFAR10/100 and SVHN respectively, batch size of 128 and a cyclic learning rate schedule with a maximum learning rate of 0.2. Long: From Rice et al. (2020), with 200 epochs, batch size of 128, a constant learning rate of 0.1 for CIFAR10/100 and 0.01 for SVHN, decayed by a factor of 10 at epochs 100 and 150. (A configuration sketch for the short schedule is given after the table.) |
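The Pseudocode row points to Algorithm 1 (ELLE / ELLE-A adversarial training) without reproducing it. For orientation only, the sketch below shows one way to implement a local-linearity penalty of the kind the paper regularizes with: sample two random points in the ℓ∞ ball around the input and a random interpolation between them, then penalize the squared gap between the loss at the interpolated point and the linear interpolation of the endpoint losses. The function name `local_linearity_penalty`, the per-sample α sampling, and the clamping to [0, 1] are assumptions made for illustration, not the authors' Algorithm 1; the reference implementation is in the linked repository.

```python
import torch
import torch.nn.functional as F

def local_linearity_penalty(model, x, y, eps):
    """Illustrative local-linearity penalty (sketch, not the authors' Algorithm 1).

    Samples two random points in the eps-ball around x and a random convex
    combination of them, then penalizes the squared deviation of the loss at
    the interpolated point from the linear interpolation of the endpoint losses.
    """
    # Two random perturbations inside the l_inf ball of radius eps.
    delta_a = (torch.rand_like(x) * 2 - 1) * eps
    delta_b = (torch.rand_like(x) * 2 - 1) * eps
    x_a = torch.clamp(x + delta_a, 0, 1)
    x_b = torch.clamp(x + delta_b, 0, 1)

    # Random convex combination of the two points (one alpha per sample).
    alpha = torch.rand(x.shape[0], 1, 1, 1, device=x.device)
    x_c = alpha * x_a + (1 - alpha) * x_b

    # Per-sample losses at the three points.
    loss_a = F.cross_entropy(model(x_a), y, reduction="none")
    loss_b = F.cross_entropy(model(x_b), y, reduction="none")
    loss_c = F.cross_entropy(model(x_c), y, reduction="none")

    # If the loss were locally linear, loss_c would equal the interpolation.
    a = alpha.view(-1)
    lin_err = loss_c - (a * loss_a + (1 - a) * loss_b)
    return (lin_err ** 2).mean()
```

During training such a penalty would be scaled by a regularization weight λ and added to the (adversarial) cross-entropy loss; the Research Type row notes that ELLE-A adapts this regularization parameter during training.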
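The Dataset Splits row mentions a 1024-image validation sample drawn from the training set, used to monitor PGD-20 adversarial accuracy. A minimal reconstruction of such a split, assuming CIFAR10, torchvision, and a fixed seed (all assumptions; the paper does not specify these details), could look like:

```python
import torch
from torch.utils.data import DataLoader, random_split
import torchvision
import torchvision.transforms as T

# Carve a 1024-image validation sample out of the CIFAR10 training set.
# The seed and the plain ToTensor transform are assumed, not from the paper.
transform = T.ToTensor()
train_full = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform
)

val_size = 1024
generator = torch.Generator().manual_seed(0)  # assumed seed for reproducibility
train_set, val_set = random_split(
    train_full, [len(train_full) - val_size, val_size], generator=generator
)

train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
val_loader = DataLoader(val_set, batch_size=128, shuffle=False)
```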
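The Experiment Setup row quotes the short training recipe. A minimal PyTorch sketch of that configuration, assuming a triangular cyclic schedule as in common fast adversarial training recipes and using a placeholder model (both assumptions; the paper only states that the schedule is cyclic with a maximum learning rate of 0.2), could look like:

```python
import torch
import torch.nn as nn

# Sketch of the "Short" recipe: SGD with momentum 0.9 and weight decay 5e-4,
# batch size 128, 30 epochs on CIFAR10/100, cyclic learning rate peaking at 0.2.
model = nn.Linear(3 * 32 * 32, 10)  # placeholder for PreActResNet-18
epochs, steps_per_epoch = 30, 50_000 // 128

optimizer = torch.optim.SGD(
    model.parameters(), lr=0.2, momentum=0.9, weight_decay=5e-4
)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer,
    base_lr=0.0,
    max_lr=0.2,
    step_size_up=(epochs * steps_per_epoch) // 2,
    step_size_down=(epochs * steps_per_epoch) // 2,
    cycle_momentum=False,  # keep momentum fixed at 0.9
)
# scheduler.step() would be called once per batch during training.
```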