Sparse High Rank Adapters
Authors: Kartikeya Bhardwaj, Nilesh Pandey, Sweta Priyadarshi, Viswanath Ganapathy, Shreya Kadambi, Rafael Esteves, Shubhankar Borse, Paul Whatmough, Risheek Garrepalli, Mart van Baalen, Harris Teague, Markus Nagel
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments on LVMs and LLMs demonstrate that finetuning only a small fraction of the parameters in the base model significantly outperforms LoRA while enabling both rapid switching and multi-adapter fusion. |
| Researcher Affiliation | Industry | Kartikeya Bhardwaj, Nilesh Prasad Pandey, Sweta Priyadarshi, Viswanath Ganapathy, Shreya Kadambi, Rafael Esteves, Shubhankar Borse, Paul Whatmough, Risheek Garrepalli, Mart van Baalen, Harris Teague, Markus Nagel; Qualcomm AI Research; {kbhardwa,pwhatmou,hteague,markusn}@qti.qualcomm.com |
| Pseudocode | No | The paper includes mathematical formulations and conceptual diagrams but does not contain explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/Qualcomm-AI-research/SHiRA. |
| Open Datasets | Yes | On the language domain, we experiment with LLaMA 7B [29], LLaMA2-7B [30] and evaluate it on various commonsense reasoning benchmarks such as HellaSwag, PIQA, SIQA, BoolQ, Arc-easy, Arc-challenge, OpenBookQA and Winogrande. ... Images present in both of these datasets are collected from public-domain (CC-0 license). |
| Dataset Splits | Yes | Table 8: Commonsense Benchmarks (lists #Train, #Val, Test for each dataset). ... The Bluefire dataset ... The validation of the Bluefire dataset consists of 30 images. ... The validation set of the Paintings dataset consists of 21 images... |
| Hardware Specification | Yes | All finetuning and evaluation experiments for language and vision tasks are done using a single NVIDIA A100 GPU. |
| Software Dependencies | Yes | Specifically, SHiRA can be implemented directly using a functionality called post_accumulate_gradient_hooks available in PyTorch 2.1.0. (A minimal hook-based sketch appears below the table.) |
| Experiment Setup | Yes | Table 9: Training hyperparameters used for finetuning experiments. |
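
The Software Dependencies row above notes that SHiRA can be implemented with PyTorch's post-accumulate gradient hooks. The sketch below illustrates one way this could look; it assumes SHiRA-style training amounts to zeroing gradients outside a fixed sparse mask, and the random mask selection, the 1% trainable fraction, and the helper name `attach_sparse_grad_masks` are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch, not the authors' code: approximates SHiRA-style sparse
# finetuning by zeroing gradients outside a fixed sparse mask.
# Requires PyTorch >= 2.1 for Tensor.register_post_accumulate_grad_hook.
import torch
import torch.nn as nn


def attach_sparse_grad_masks(model: nn.Module, trainable_fraction: float = 0.01):
    """Keep only a small random fraction of each weight matrix trainable.

    The random mask and 1% default fraction are illustrative placeholders;
    the paper's actual mask-selection strategy may differ.
    """
    handles, masks = [], {}
    for name, param in model.named_parameters():
        if param.dim() < 2:
            continue  # this sketch leaves biases / norm parameters untouched
        # Binary mask: 1 where an entry stays trainable, 0 elsewhere.
        mask = torch.rand(param.shape, device=param.device) < trainable_fraction
        mask = mask.to(param.dtype)
        masks[name] = mask

        def make_hook(m):
            def hook(p: torch.Tensor):
                # Runs after p.grad has been fully accumulated for this step.
                if p.grad is not None:
                    p.grad.mul_(m)
            return hook

        handles.append(param.register_post_accumulate_grad_hook(make_hook(mask)))
    return handles, masks


# Usage sketch: attach the masks, then train with any optimizer.
# model = ...  # e.g. a LLaMA or LVM base model
# handles, masks = attach_sparse_grad_masks(model, trainable_fraction=0.01)
```

After finetuning under such a mask, the adapter reduces to the sparse set of changed weights, which is what enables the rapid switching and multi-adapter fusion quoted in the Research Type row. One practical caveat for this sketch: decoupled weight decay (e.g. in AdamW) would still perturb masked-out entries, so it should be disabled or restricted to the trainable subset.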