Improving Robustness with Adaptive Weight Decay

Authors: Mohammad Amin Ghiasi, Ali Shafahi, Reza Ardekani

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through rigorous experimentation, we demonstrate that AWD consistently yields enhanced robustness. By conducting experiments on diverse datasets and architectures, we provide empirical evidence to showcase the effectiveness of our approach in mitigating robust overfitting.
Researcher Affiliation | Industry | Amin Ghiasi, Ali Shafahi, Reza Ardekani; Apple, Cupertino, CA 95014; {mghiasi2, ashafahi, rardekani}@apple.com
Pseudocode | Yes | Algorithm 1 (Adaptive Weight Decay), reproduced below; a PyTorch sketch follows the table.
    Input: λ_awd > 0
    λ̄ ← 0
    for (x, y) ∈ loader do
        p ← model(x)                           ▷ Get the model's prediction.
        ℓ_main ← CrossEntropy(p, y)            ▷ Compute the cross-entropy loss.
        ∇w ← backward(ℓ_main)                  ▷ Compute the gradients of the main loss w.r.t. the weights.
        λ ← ‖∇w‖ · λ_awd / ‖w‖                 ▷ Compute this iteration's weight-decay hyperparameter.
        λ̄ ← 0.1 · λ + 0.9 · stop_gradient(λ̄)   ▷ Update the weighted average as a scalar.
        w ← w − lr · (∇w + λ̄ · w)              ▷ Update the network's parameters.
    end for
Open Source Code | No | The paper does not contain any explicit statement about making the source code available, nor does it provide a link to a code repository.
Open Datasets | Yes | We focus on six datasets: SVHN, Fashion MNIST, Flowers, CIFAR-10, CIFAR-100, and Tiny ImageNet.
Dataset Splits | Yes | We reserve 10% of the training examples as a held-out validation set for early stopping and checkpoint selection.
Hardware Specification | No | The paper does not specify the hardware used for experiments, such as specific GPU or CPU models.
Software Dependencies | No | The paper does not provide specific software dependencies or their version numbers, such as Python or PyTorch versions.
Experiment Setup | Yes | We train for 200 epochs, using an initial learning rate of 0.1 combined with a cosine learning-rate schedule (reflected in the sketch below).
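
PyTorch sketch of the quoted setup. The code below combines Algorithm 1 with the training schedule quoted in the Experiment Setup row (200 epochs, initial learning rate 0.1, cosine schedule). It is a minimal illustrative sketch, not the authors' implementation (none is linked): the framework (PyTorch), the use of torch.optim.SGD, the global-norm computation over all trainable parameters, and the names awd_train, lambda_awd, and lambda_bar are assumptions made for this example; λ_awd is the input hyperparameter of Algorithm 1 and is left as an argument since no value is quoted here.

import torch
import torch.nn.functional as F

def awd_train(model, loader, lambda_awd, epochs=200, lr=0.1, device="cpu"):
    """Train `model` with Adaptive Weight Decay (Algorithm 1) and a cosine LR schedule."""
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=lr)
    # Cosine learning-rate schedule over `epochs`, starting from lr = 0.1
    # (Experiment Setup row above).
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

    lambda_bar = 0.0  # running weight-decay coefficient (λ̄ in Algorithm 1)
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss_main = F.cross_entropy(model(x), y)  # main cross-entropy loss
            loss_main.backward()                      # gradients of the main loss w.r.t. the weights

            with torch.no_grad():
                # Global norms over all trainable parameters (an illustrative choice).
                grad_norm = torch.norm(torch.stack([p.grad.norm() for p in params]))
                weight_norm = torch.norm(torch.stack([p.norm() for p in params]))
                # Per-iteration coefficient: λ = ‖∇w‖ · λ_awd / ‖w‖
                lam = (grad_norm * lambda_awd / weight_norm).item()
                # Running average; the stop_gradient of Algorithm 1 is implicit here
                # because lambda_bar is kept as a detached Python float.
                lambda_bar = 0.1 * lam + 0.9 * lambda_bar
                # Fold the adaptive decay into the gradient so the SGD step performs
                # w ← w − lr · (∇w + λ̄ · w).
                for p in params:
                    p.grad.add_(p, alpha=lambda_bar)

            optimizer.step()
        scheduler.step()
    return model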