On the Use of Anchoring for Training Vision Models
Authors: Vivek Sivaraman Narayanaswamy, Kowshik Thopalli, Rushil Anirudh, Yamen Mubarka, Wesam Sakla, Jayaraman J. Thiagarajan
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate our proposed approach across datasets and architectures of varying scales and complexities, demonstrating substantial performance gains in generalization and safety metrics compared to the standard training protocol. |
| Researcher Affiliation | Collaboration | Vivek Narayanaswamy (Lawrence Livermore National Laboratory, narayanaswam1@llnl.gov); Kowshik Thopalli (Lawrence Livermore National Laboratory, thopalli1@llnl.gov); Rushil Anirudh (Amazon, rushil15anirudh@gmail.com); Yamen Mubarka (Lawrence Livermore National Laboratory, mubarka1@llnl.gov); Wesam Sakla (Lawrence Livermore National Laboratory, sakla1@llnl.gov); Jayaraman J. Thiagarajan (Lawrence Livermore National Laboratory, jjthiagarajan@gmail.com) |
| Pseudocode | Yes | Figure 3: PyTorch-style pseudocode for our proposed approach. (A hedged sketch of this pseudocode appears after the table.) |
| Open Source Code | Yes | The open-source code is available at https://software.llnl.gov/anchoring |
| Open Datasets | Yes | (i) CIFAR-10 and (ii) CIFAR-100 [13] datasets contain 50,000 training samples and 10,000 test samples each of size 32 × 32, belonging to 10 and 100 classes, respectively; (iii) ImageNet-1K [14] is a large-scale vision benchmark comprising 1.3 million training images and 50,000 validation images across 1000 diverse categories. |
| Dataset Splits | Yes | ImageNet-1K [14] is a large-scale vision benchmark comprising 1.3 million training images and 50,000 validation images across 1000 diverse categories. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. It mentions using 'high-capacity architectures' but no specific hardware. |
| Software Dependencies | No | The paper mentions 'PyTorch style pseudo code' and references 'https://pytorch.org/vision', implying the use of PyTorch, but it does not specify any software components with version numbers (e.g., PyTorch 1.x, Python 3.x). |
| Experiment Setup | Yes | Choice of α. Through extensive empirical studies with multiple architectures, we found that using the masking schedule hyper-parameter α = 0.2 (corresponding to every 5th batch in an epoch) leads to stable convergence (closely matching the top-1 validation accuracy of standard training) on ImageNet, and α = 0.25 for CIFAR-10/100. Note that our approach performs reference masking for an entire batch as determined by α. We have included our analysis on the impact of the choice of α in Section A.1. Table 5 outlines the recipes (augmentations, epochs, optimizers) leveraged for model training. (A worked example of the α-to-period mapping appears after the table.) |
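
Since the paper's Figure 3 pseudocode is only referenced above, here is a minimal PyTorch-style sketch of anchored training with batch-level reference masking, reconstructed from the details quoted in this table. The `AnchoredWrapper` class, the shuffled-batch reference choice, and the uniform-distribution target on masked batches are illustrative assumptions, not the authors' exact code; the authoritative version is Figure 3 of the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnchoredWrapper(nn.Module):
    """Anchoring: reparameterize an input x as the tuple (r, x - r).

    The reference r and the residual x - r are concatenated along the
    channel axis, so the backbone must accept twice the usual number of
    input channels (e.g., 6 instead of 3 for RGB).
    """
    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone

    def forward(self, x: torch.Tensor, mask_reference: bool = False) -> torch.Tensor:
        # One common choice of reference: a shuffled copy of the batch
        # (an assumption here; the paper's reference strategy may differ).
        ref = x[torch.randperm(x.size(0), device=x.device)]
        residual = x - ref
        if mask_reference:
            # Reference masking is applied to the entire batch, as stated
            # in the "Experiment Setup" row above.
            ref = torch.zeros_like(ref)
        return self.backbone(torch.cat([ref, residual], dim=1))

def training_step(model: AnchoredWrapper, x, y, batch_idx: int, alpha: float = 0.2):
    # alpha = 0.2 masks every 5th batch (see the schedule example below).
    mask = batch_idx % round(1 / alpha) == 0
    logits = model(x, mask_reference=mask)
    if mask:
        # Assumed objective for masked batches: push predictions toward a
        # uniform distribution so that residuals alone are not predictive.
        uniform = torch.full_like(logits, 1.0 / logits.size(1))
        return F.kl_div(F.log_softmax(logits, dim=1), uniform, reduction="batchmean")
    return F.cross_entropy(logits, y)
```

The backbone here would be, e.g., a ResNet whose first convolution is widened to accept the concatenated (reference, residual) channels; the masked-batch loss shown is one plausible reading of the regularizer, not the paper's confirmed formulation.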
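
For concreteness, the mapping between the masking schedule hyper-parameter α and the batch period used in the sketch above is simple arithmetic; the snippet below reproduces the settings quoted in the "Experiment Setup" row (illustrative only).

```python
# alpha -> masking period: alpha = 0.2 masks every 5th batch (ImageNet),
# and by the same arithmetic alpha = 0.25 masks every 4th (CIFAR-10/100).
for alpha in (0.2, 0.25):
    period = round(1 / alpha)
    print(f"alpha = {alpha}: mask one batch in every {period}")
```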