Mixture of Adversarial LoRAs: Boosting Robust Generalization in Meta-Tuning
Authors: Xu Yang, Chen Liu, Ying Wei
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive evaluations demonstrate that AMT yields significant improvements, up to 12.92% in clean generalization and up to 49.72% in adversarial generalization, over previous state-of-the-art methods across a diverse range of OOD few-shot image classification tasks on three benchmarks, confirming the effectiveness of our approach to boost the robust generalization of pre-trained models. |
| Researcher Affiliation | Academia | ¹City University of Hong Kong, ²Zhejiang University |
| Pseudocode | Yes | Algorithm 1 shows our adversarial meta-tuning pipeline. ... Algorithm 2 in Appendix A shows our test-time merging pipeline. (A hedged merging sketch follows the table.) |
| Open Source Code | Yes | Our code is available at https://github.com/xyang583/AMT. |
| Open Datasets | Yes | We evaluate AMT using the large-scale cross-domain few-shot classification benchmarks Meta-Dataset [16], BSCD-FSL [29] and fine-grained datasets [30]. |
| Dataset Splits | Yes | We sample five tasks from each domain as the validation set for hyperparameter selection. |
| Hardware Specification | Yes | The experiments were conducted on one NVIDIA A6000 GPU. |
| Software Dependencies | No | The paper mentions software components like DINO pre-training checkpoint, SGD optimizer, and Vision Transformer, but does not specify their version numbers (e.g., PyTorch version, CUDA version). |
| Experiment Setup | Yes | The SGD optimizer with a momentum of 0.9 and a cosine-decayed learning rate η2 starting at 5 × 10⁻⁴ is adopted. Training is conducted for a maximum of 30 epochs, with a 5-epoch warm-up stage. The loss trade-off coefficient λadv is set to 6. The input image size is 128 × 128 as per PMF [12]. ... We use a pool of size P = 4 and a LoRA rank of r = 8, choosing the top 2 from the pool for merging at test time. ... The adversarial query set is generated using untargeted weak and strong patch perturbations [74] with l∞-bounded budgets ϵ ∈ {0.01/255, 0.1/255, 6/255, 8/255} in 2 steps, and a step size of α = ϵ/10. The size of the neighborhood η1 is set to 1 × 10⁻⁴ for adversarial perturbation on singular values and vectors. We search domain-wise hyperparameters on the validation set, including λ in the range [0, 1], β in the range [1, 12], and ρ in the range [0, 1]. (Hedged sketches of the optimizer schedule and attack configuration follow the table.) |
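
The optimizer and schedule from the setup row can be approximated as follows. This is a minimal PyTorch sketch: the SGD momentum (0.9), starting learning rate (5 × 10⁻⁴), epoch count (30), and 5-epoch warm-up come from the paper, but the linear warm-up shape, the `LambdaLR` scheduler choice, and the placeholder model are assumptions.

```python
import math

import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(384, 10)  # placeholder; the paper meta-tunes a ViT

base_lr = 5e-4       # eta_2 in the paper
max_epochs = 30
warmup_epochs = 5

optimizer = SGD(model.parameters(), lr=base_lr, momentum=0.9)

def lr_lambda(epoch: int) -> float:
    # Assumed linear warm-up for the first 5 epochs, cosine decay afterwards.
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / (max_epochs - warmup_epochs)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = LambdaLR(optimizer, lr_lambda=lr_lambda)
```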
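The adversarial query set is described as a 2-step l∞-bounded attack with step size α = ϵ/10. Below is a minimal full-image PGD sketch under those budgets; the paper actually uses patch perturbations [74], so this illustrates only the reported budget and step rule, not the authors' attack.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, steps=2):
    """2-step l_inf-bounded perturbation with the reported alpha = eps / 10."""
    alpha = eps / 10
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()  # untargeted ascent step
            delta.clamp_(-eps, eps)             # project into the l_inf ball
        delta.grad.zero_()
    return (x + delta).detach()
```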
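Finally, the test-time merging step (Algorithm 2: pool of P = 4 rank-8 adapters, top 2 merged) could look roughly like the sketch below. The `relevance` scoring function is hypothetical, and the uniform averaging is an assumption; the paper's Algorithm 2 defines the actual selection and merging rule.

```python
import torch

rank, dim, P, top_k = 8, 384, 4, 2  # LoRA rank r = 8, pool P = 4, merge top 2

# Pool of LoRA adapters stored as (A, B) factors; delta_W = B @ A.
pool = [(torch.randn(rank, dim), torch.randn(dim, rank)) for _ in range(P)]

def relevance(adapter, support_feats):
    # Hypothetical task-relevance score: magnitude of the adapter's update
    # applied to the support embeddings.
    A, B = adapter
    return ((B @ A) @ support_feats.T).norm()

support_feats = torch.randn(25, dim)  # e.g. a 5-way 5-shot support set

scores = torch.stack([relevance(ad, support_feats) for ad in pool])
top_idx = scores.topk(top_k).indices

# Uniformly average the selected low-rank updates into one merged update.
merged_delta_W = sum(pool[i][1] @ pool[i][0] for i in top_idx.tolist()) / top_k
```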