Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks
Authors: Yang Li, Shaobo Han, Jonathan Shihao Ji
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of VB-LoRA on natural language understanding, natural language generation, instruction tuning, and mathematical reasoning tasks. |
| Researcher Affiliation | Collaboration | Yang Li Dept. of Computer Science Georgia State University Atlanta, GA 30303 EMAIL; Shaobo Han Optical Networking and Sensing NEC Laboratories America Princeton, NJ 08540 EMAIL; Shihao Ji School of Computing University of Connecticut Storrs, CT 06269 EMAIL |
| Pseudocode | Yes | Algorithm 1: Pseudocode of VB-LoRA in a PyTorch-like style |
| Open Source Code | Yes | Our source code is available at https://github.com/leo-yangli/VB-LoRA. |
| Open Datasets | Yes | We adopt the General Language Understanding Evaluation (GLUE) benchmark [Wang et al., 2018]; GPT-2 Medium and Large models [Radford et al., 2019] on the E2E dataset [Novikova et al., 2017]; Cleaned Alpaca Dataset; MT-Bench [Zheng et al., 2024]; MetaMathQA [Yu et al., 2023] dataset; GSM8K [Cobbe et al., 2021]; and MATH [Hendrycks et al., 2021] datasets. All are accompanied by citations and/or URLs with licensing information. |
| Dataset Splits | Yes | For natural language generation experiments, we fine-tune the GPT-2 Medium and Large models [Radford et al., 2019] on the E2E dataset [Novikova et al., 2017], which contains approximately 42,000 training examples, 4,600 validation examples, and 4,600 test examples from the restaurant domain. We adopt the General Language Understanding Evaluation (GLUE) benchmark [Wang et al., 2018] to assess the performance of VB-LoRA across various natural language understanding tasks. |
| Hardware Specification | Yes | All our experiments were conducted on a server equipped with 8 NVIDIA A100 GPUs. |
| Software Dependencies | No | The paper mentions using a 'PyTorch-like style' for pseudocode and integrating into the 'PyTorch framework', as well as the 'QLoRA framework', but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We create a vector bank of 90 vectors of a length of 256, initialized with a uniform distribution U(-0.02, 0.02). The logits are initialized with a normal distribution N(0, 0.01). The learning rates for the vector bank and logit parameters are set to 0.001 and 0.01, respectively. We set the rank to 4 and k = 2 for all our experiments. |
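
The initialization quoted in the Experiment Setup row can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation (their code is in the repository linked above). The bank size, vector length, value ranges, and k come from the quoted setup. Everything else is an assumption made for illustration: the function names are hypothetical, 0.01 is treated as a standard deviation rather than a variance, and the mixing rule (a softmax over the k largest logits followed by a weighted sum of the selected bank vectors) is one plausible reading of how the logits select from the bank.

```python
import math
import random

BANK_SIZE = 90      # number of vectors in the bank (from the quoted setup)
VECTOR_LEN = 256    # length of each bank vector (from the quoted setup)
TOP_K = 2           # k = 2 (from the quoted setup)


def init_vector_bank(rng):
    """Draw every bank entry from U(-0.02, 0.02)."""
    return [[rng.uniform(-0.02, 0.02) for _ in range(VECTOR_LEN)]
            for _ in range(BANK_SIZE)]


def init_logits(rng):
    """One logit per bank vector; 0.01 is assumed to be the std, not the variance."""
    return [rng.gauss(0.0, 0.01) for _ in range(BANK_SIZE)]


def top_k_mixture(logits, bank, k=TOP_K):
    """Assumed mixing rule: softmax over the k largest logits, then a weighted sum."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)          # subtract the max for stability
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    weights = [e / total for e in exps]
    mixed = [sum(w * bank[i][d] for w, i in zip(weights, top))
             for d in range(len(bank[0]))]
    return top, weights, mixed


rng = random.Random(0)  # fixed seed so the sketch is repeatable
bank = init_vector_bank(rng)
logits = init_logits(rng)
indices, weights, sub_vector = top_k_mixture(logits, bank)
```

Here `sub_vector` is a convex combination of two bank vectors, so its entries stay within the initialization range. The rank of 4 and the two learning rates (0.001 for the bank, 0.01 for the logits) belong to the training loop, which this sketch does not cover.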