Mixture of Adversarial LoRAs: Boosting Robust Generalization in Meta-Tuning

Authors: Xu Yang, Chen Liu, Ying Wei

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive evaluations demonstrate that AMT yields significant improvements, up to 12.92% in clean generalization and up to 49.72% in adversarial generalization, over previous state-of-the-art methods across a diverse range of OOD few-shot image classification tasks on three benchmarks, confirming the effectiveness of our approach in boosting the robust generalization of pre-trained models.
Researcher Affiliation | Academia | 1. City University of Hong Kong 2. Zhejiang University
Pseudocode | Yes | Algorithm 1 shows our adversarial meta-tuning pipeline. ... Algorithm 2 in Appendix A shows our test-time merging algorithm pipeline.
Open Source Code | Yes | Our code is available at https://github.com/xyang583/AMT.
Open Datasets | Yes | We evaluate AMT using the large-scale cross-domain few-shot classification benchmarks Meta-Dataset [16], BSCD-FSL [29], and fine-grained datasets [30].
Dataset Splits | Yes | We sample five tasks from each domain as the validation set for hyperparameter selection.
Hardware Specification | Yes | The experiments were conducted on one NVIDIA A6000 GPU.
Software Dependencies | No | The paper mentions software components such as the DINO pre-training checkpoint, the SGD optimizer, and the Vision Transformer, but does not specify version numbers (e.g., PyTorch version, CUDA version).
Experiment Setup | Yes | The SGD optimizer with a momentum of 0.9 and a cosine-decayed learning rate η2 starting at 5e-4 are adopted. Training is conducted for a maximum of 30 epochs, with a 5-epoch warm-up stage. The loss trade-off coefficient λ_adv is set to 6. The input image size is 128 × 128 as per PMF [12]. ... We use a pool of size P = 4 and a LoRA rank of r = 8, choosing the top 2 from the pool for merging at test time. ... The adversarial query set is generated using untargeted weak and strong patch perturbations [74] with ℓ∞-bounded budgets ϵ ∈ {0.01/255, 0.1/255, 6/255, 8/255} in 2 steps, and a step size of α = ϵ/10. The size of the neighborhood η1 is set to 1e-4 for adversarial perturbation on singular values and vectors. We search domain-wise hyperparameters on the validation set, including λ in the range [0, 1], β in the range [1, 12], and ρ in the range [0, 1].
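The test-time recipe reported above (a pool of P = 4 LoRA adapters, with the top 2 merged per task) can be sketched in plain Python. This is a minimal illustration only: the selection scores and the uniform averaging below are assumptions, not the paper's actual merging procedure (that is Algorithm 2 in the Appendix, which is not reproduced here).

```python
def merge_top_k(pool, scores, k=2):
    """Pick the k highest-scoring LoRA deltas and average them.

    pool   : list of P weight deltas (flattened to lists of floats here)
    scores : per-adapter relevance scores for the current test task
             (how these scores are computed is an assumption here)
    """
    # Indices of the top-k adapters, ordered by descending score.
    top = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)[:k]
    # Uniform average of the selected deltas (an illustrative choice;
    # the paper's Algorithm 2 may weight the adapters differently).
    merged = [sum(pool[i][j] for i in top) / k for j in range(len(pool[0]))]
    return top, merged

# Toy pool of P = 4 adapters, each reduced to a 2-parameter delta.
pool = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
scores = [0.1, 0.9, 0.3, 0.7]
top, merged = merge_top_k(pool, scores, k=2)
# top = [1, 3]; merged = [5.0, 6.0]
```

In a real implementation each pool entry would be the low-rank update B·A per layer rather than a flat list, and the merged delta would be added to the frozen pre-trained weights before evaluating the few-shot task.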