VeRA: Vector-based Random Matrix Adaptation
Authors: Dawid Jan Kopiczko, Tijmen Blankevoort, Yuki M Asano
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct a series of experiments to evaluate our finetuning method. We start by comparing our approach to LoRA and other baselines on the GLUE and E2E benchmarks. Following this, we turn our attention to instruction-tuning of Llama models, and image classification with Vision Transformers. Next, we select one task and vary the rank for both methods, LoRA and VeRA, to examine how performance scales with the number of trainable parameters. Lastly, an ablation study sheds light on the importance of each component in our method, including the influence of different initializations. |
| Researcher Affiliation | Collaboration | Dawid J. Kopiczko, QUVA Lab, University of Amsterdam; Tijmen Blankevoort, Qualcomm AI Research; Yuki M. Asano, QUVA Lab, University of Amsterdam |
| Pseudocode | No | The paper describes its method formulation using mathematical equations and diagrams (Figure 1), but does not provide any structured pseudocode or algorithm blocks; see the hedged illustrative sketch after this table. |
| Open Source Code | No | The paper mentions a website: '1Website: https://dkopi.github.io/vera/'. This is a project website (GitHub Pages) and not a direct link to a source-code repository for the methodology described in the paper. There is no explicit statement of code release or specific repository link provided for the code itself. |
| Open Datasets | Yes | We evaluate our approach on the General Language Understanding Evaluation (GLUE) benchmark (Wang et al., 2019), employing the RoBERTa-base and RoBERTa-large models (Liu et al., 2019). For the E2E benchmark (Novikova et al., 2017), we follow the experimental setup from Hu et al. (2022) and finetune the GPT-2 (Radford et al., 2019) Medium and Large models. (...) We employ the Alpaca dataset (Taori et al., 2023), specifically its cleaned version. (...) To evaluate the method on the image classification task, we adapt Vision Transformer (ViT) (Dosovitskiy et al., 2021), Base and Large variants, on datasets CIFAR100 (Krizhevsky, 2009), Food101 (Bossard et al., 2014), Flowers102 (Nilsback & Zisserman, 2008), and RESISC45 (Cheng et al., 2017). |
| Dataset Splits | Yes | We perform 5 runs with different random seeds, recording the best epoch's outcome for each run, and report the median of these results. (...) For each dataset we train on a subset of 10 samples per class, and evaluate on the full test set (CIFAR100, Food101, Flowers102) or on all the remaining samples (RESISC45). |
| Hardware Specification | No | The paper mentions using 'a single GPU' and 'National Supercomputer Snellius and Distributed ASCI Supercomputer 6 (Bal et al., 2016)', but does not provide specific models or detailed specifications for the GPUs, CPUs, or other hardware components used for the experiments. 'Single GPU' is not specific enough. |
| Software Dependencies | No | The paper mentions software like 'PyTorch (Paszke et al., 2019)' and 'Hugging Face PEFT (Mangrulkar et al., 2022)' but does not provide specific version numbers for these or any other ancillary software components. |
| Experiment Setup | Yes | We determine the learning rates and the number of training epochs through hyperparameter tuning; for detailed settings, refer to Table 8 in Appendix A. (...) Table 8: Hyperparameter configurations for different model sizes on GLUE benchmark. Optimizer, Warmup Ratio, and LR Schedule are taken from Hu et al. (2022) (...) Table 9: Hyperparameter configurations for instruction-tuning. (...) Table 10: Hyperparameter configurations for VeRA on the E2E benchmark (...) Table 11: Hyperparameter configurations for VeRA and LoRA for finetuning ViT on the image classification datasets. |
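
Since the paper presents VeRA only through equations and a diagram (see the Pseudocode row above), the following is a minimal PyTorch-style sketch of the idea as described in the paper: the random low-rank matrices A and B are frozen and shared across adapted layers, and only the per-layer scaling vectors d and b are trained, giving an adapted forward pass of roughly h = W0 x + Λ_b B Λ_d A x. Class and variable names (`VeRALinear`, `shared_A`, `shared_B`, `d_init`) are illustrative assumptions, not taken from an official implementation.

```python
import torch
import torch.nn as nn

class VeRALinear(nn.Module):
    """Sketch of a VeRA-adapted linear layer (illustrative, not the authors' code).

    Frozen pieces: the pretrained weight W0 and the shared random matrices A, B.
    Trainable pieces: the per-layer scaling vectors d (length r) and b (length d_out).
    Forward pass: h = W0 x + Lambda_b B Lambda_d A x.
    """

    def __init__(self, pretrained: nn.Linear, shared_A: torch.Tensor,
                 shared_B: torch.Tensor, d_init: float = 0.1):
        super().__init__()
        self.base = pretrained                    # W0, kept frozen
        for p in self.base.parameters():
            p.requires_grad_(False)

        r, _ = shared_A.shape                     # A: (r, d_in), random and frozen
        d_out, _ = shared_B.shape                 # B: (d_out, r), random and frozen
        self.register_buffer("A", shared_A)       # buffers are not updated by the optimizer
        self.register_buffer("B", shared_B)

        # Only these two vectors are trained; the paper initializes b to zeros
        # and d to a small constant (d_init is ablated in the paper).
        self.d = nn.Parameter(torch.full((r,), d_init))
        self.b = nn.Parameter(torch.zeros(d_out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Diagonal scalings Lambda_d and Lambda_b applied as elementwise multiplications.
        delta = (x @ self.A.T) * self.d           # Lambda_d A x
        delta = (delta @ self.B.T) * self.b       # Lambda_b B (Lambda_d A x)
        return self.base(x) + delta


# Tiny usage example; the same random A and B would be reused for every adapted layer.
if __name__ == "__main__":
    torch.manual_seed(0)
    d_in, d_out, r = 768, 768, 256                # illustrative sizes only
    shared_A = torch.randn(r, d_in)
    shared_B = torch.randn(d_out, r)
    layer = VeRALinear(nn.Linear(d_in, d_out), shared_A, shared_B)
    out = layer(torch.randn(2, d_in))
    print(out.shape)                              # torch.Size([2, 768])
```

Because A and B are shared and frozen, only the vectors d and b (plus any classifier head) need to be stored per task, which is the source of the parameter savings the experiments in the table above measure against LoRA.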