Improving Robustness with Adaptive Weight Decay
Authors: Mohammad Amin Ghiasi, Ali Shafahi, Reza Ardekani
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through rigorous experimentation, we demonstrate that AWD consistently yields enhanced robustness. By conducting experiments on diverse datasets and architectures, we provide empirical evidence to showcase the effectiveness of our approach in mitigating robust overfitting. |
| Researcher Affiliation | Industry | Amin Ghiasi, Ali Shafahi, Reza Ardekani. Apple, Cupertino, CA 95014. {mghiasi2, ashafahi, rardekani}@apple.com |
| Pseudocode | Yes | Algorithm 1: Adaptive Weight Decay. 1: Input: λ_awd > 0; 2: λ̄ ← 0; 3: for (x, y) ∈ loader do; 4: p ← model(x) ▷ Get model's prediction; 5: ℓ_main ← CrossEntropy(p, y) ▷ Compute cross-entropy; 6: ∇w ← backward(ℓ_main) ▷ Compute the gradients of the main loss w.r.t. weights; 7: λ ← ‖∇w‖ · λ_awd / ‖w‖ ▷ Compute this iteration's weight-decay hyperparameter; 8: λ̄ ← 0.1·λ + 0.9·stop_gradient(λ̄) ▷ Compute the weighted average as a scalar; 9: w ← w − lr·(∇w + λ̄·w) ▷ Update the network's parameters; 10: end for. (A runnable sketch follows the table.) |
| Open Source Code | No | The paper does not contain any explicit statement about making the source code available or provide a link to a code repository. |
| Open Datasets | Yes | We focus on six datasets: SVHN, Fashion MNIST, Flowers, CIFAR-10, CIFAR-100, and Tiny ImageNet. |
| Dataset Splits | Yes | We reserve 10% of the training examples as a held-out validation set for early stopping and checkpoint selection. |
| Hardware Specification | No | The paper does not specify the hardware used for experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper does not provide specific software dependencies or their version numbers, such as Python or PyTorch versions. |
| Experiment Setup | Yes | We train for 200 epochs, using an initial learning rate of 0.1 combined with a cosine learning-rate schedule. (A training-loop sketch follows the table.) |
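
Since the paper does not release code (per the Open Source Code row above), the following PyTorch sketch of Algorithm 1 is a reconstruction, not the authors' implementation: the function name `awd_epoch`, the plain-SGD update, and the use of a single flattened norm over all parameters are assumptions made here. The `torch.no_grad()` block stands in for the `stop_gradient` in line 8, keeping `lambda_bar` a plain scalar.

```python
import torch
import torch.nn as nn

def awd_epoch(model, loader, lr, lambda_awd, lambda_bar=0.0):
    """One pass of Algorithm 1 (Adaptive Weight Decay) over `loader`.

    Hypothetical sketch: the paper gives pseudocode only, so the plain-SGD
    update and the global (flattened) parameter norms are assumptions.
    """
    criterion = nn.CrossEntropyLoss()
    for x, y in loader:
        loss = criterion(model(x), y)   # line 5: l_main, cross-entropy on the batch
        model.zero_grad()
        loss.backward()                 # line 6: gradients of l_main w.r.t. weights

        with torch.no_grad():           # plays the role of stop_gradient in line 8
            params = [p for p in model.parameters() if p.grad is not None]
            grad_norm = torch.cat([p.grad.flatten() for p in params]).norm()
            w_norm = torch.cat([p.flatten() for p in params]).norm()
            # line 7: this iteration's adaptive weight-decay coefficient
            lam = grad_norm * lambda_awd / w_norm
            # line 8: exponential moving average, kept as a detached scalar
            lambda_bar = 0.1 * lam.item() + 0.9 * lambda_bar
            # line 9: w <- w - lr * (grad + lambda_bar * w)
            for p in params:
                p.add_(p.grad + lambda_bar * p, alpha=-lr)
    return lambda_bar
```

Keeping `lambda_bar` as a detached running average lets the decay strength track the gradient-to-weight norm ratio each iteration without requiring a second backward pass.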
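
The experiment-setup row (200 epochs, initial learning rate 0.1, cosine schedule) maps onto a standard PyTorch scheduler. A minimal sketch under the same caveats, reusing the hypothetical `awd_epoch` above; the stub model, dummy data, and `LAMBDA_AWD` value are placeholders rather than values from the paper:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model/data so the sketch runs end to end; swap in the paper's
# architectures and dataset loaders for a real experiment.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
loader = DataLoader(TensorDataset(torch.randn(64, 3, 32, 32),
                                  torch.randint(0, 10, (64,))), batch_size=32)

EPOCHS = 200        # training length stated in the paper
LAMBDA_AWD = 1e-2   # placeholder; the paper tunes lambda_awd per dataset

# The optimizer only carries the learning rate for the scheduler;
# awd_epoch applies the parameter update manually.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # initial lr from the paper
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)

lambda_bar = 0.0
for epoch in range(EPOCHS):
    lr = scheduler.get_last_lr()[0]   # current cosine-annealed learning rate
    lambda_bar = awd_epoch(model, loader, lr=lr,
                           lambda_awd=LAMBDA_AWD, lambda_bar=lambda_bar)
    scheduler.step()
```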