Sparse High Rank Adapters

Authors: Kartikeya Bhardwaj, Nilesh Prasad Pandey, Sweta Priyadarshi, Viswanath Ganapathy, Shreya Kadambi, Rafael Esteves, Shubhankar Borse, Paul Whatmough, Risheek Garrepalli, Mart van Baalen, Harris Teague, Markus Nagel

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experiments on LVMs and LLMs demonstrate that finetuning only a small fraction of the parameters in the base model significantly outperforms LoRA while enabling both rapid switching and multi-adapter fusion. (See the adapter-switching sketch after the table.)
Researcher Affiliation | Industry | Kartikeya Bhardwaj, Nilesh Prasad Pandey, Sweta Priyadarshi, Viswanath Ganapathy, Shreya Kadambi, Rafael Esteves, Shubhankar Borse, Paul Whatmough, Risheek Garrepalli, Mart van Baalen, Harris Teague, Markus Nagel — Qualcomm AI Research, {kbhardwa,pwhatmou,hteague,markusn}@qti.qualcomm.com
Pseudocode | No | The paper includes mathematical formulations and conceptual diagrams but does not contain explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/Qualcomm-AI-research/SHiRA
Open Datasets | Yes | On the language domain, we experiment with LLaMA 7B [29], LLaMA2-7B [30] and evaluate it on various commonsense reasoning benchmarks such as HellaSwag, PIQA, SIQA, BoolQ, ARC-Easy, ARC-Challenge, OpenBookQA and Winogrande. ... Images present in both of these datasets are collected from public-domain (CC-0 license).
Dataset Splits | Yes | Table 8: Commonsense Benchmarks (lists #Train, #Val, #Test for each dataset). ... The Bluefire dataset ... The validation set of the Bluefire dataset consists of 30 images. ... The validation set of the Paintings dataset consists of 21 images...
Hardware Specification | Yes | All finetuning and evaluation experiments for language and vision tasks are done using a single NVIDIA A100 GPU.
Software Dependencies | Yes | Specifically, SHiRA can be implemented directly using a functionality called post_accumulate_gradient_hooks available in PyTorch 2.1.0. (See the gradient-hook sketch after the table.)
Experiment Setup | Yes | Table 9: Training hyperparameters used for finetuning experiments.
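
The Research Type row above quotes the paper's claim that sparse finetuning enables rapid adapter switching and multi-adapter fusion. The sketch below illustrates why sparsity buys this; the `SparseAdapter` container and `apply_adapter` helper are hypothetical names for illustration, not the paper's released code. A sparse adapter reduces to (index, value) deltas per weight, so switching is a sparse scatter-add rather than a dense weight merge.

```python
import torch

class SparseAdapter:
    """Hypothetical container: a SHiRA-style adapter is just the sparse set of
    finetuned weight deltas, stored per parameter as flat indices + values."""
    def __init__(self, deltas):
        self.deltas = deltas  # {param_name: (LongTensor indices, Tensor values)}

@torch.no_grad()
def apply_adapter(model, adapter, sign=1.0):
    # Switching is a sparse scatter-add into the base weights: cost is
    # proportional to the number of nonzero deltas, with no dense merge
    # step (unlike fusing a LoRA product AB into W).
    params = dict(model.named_parameters())
    for name, (idx, vals) in adapter.deltas.items():
        params[name].view(-1).index_add_(0, idx, vals, alpha=sign)

# Rapid switching: remove adapter a, then apply adapter b:
#   apply_adapter(model, a, sign=-1.0); apply_adapter(model, b)
# Multi-adapter fusion: apply several adapters; overlapping indices simply add.
```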
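The Software Dependencies row refers to post-accumulate gradient hooks in PyTorch 2.1.0; the corresponding public API is `torch.Tensor.register_post_accumulate_grad_hook`. Below is a minimal sketch of the idea, assuming a random sparse mask purely for illustration (the paper proposes specific strategies for choosing which entries to train): each parameter's gradient is masked after accumulation so only the chosen fraction of weights is finetuned.

```python
import torch

def attach_sparse_masks(model: torch.nn.Module, keep_frac: float = 0.01):
    """Freeze all but a random keep_frac of each weight's entries.
    Random selection is an assumption for this sketch; SHiRA's actual
    mask-selection strategies are described in the paper."""
    handles = []
    for p in model.parameters():
        mask = (torch.rand_like(p) < keep_frac).to(p.dtype)

        def hook(param, mask=mask):
            # Fires after param.grad is fully accumulated (PyTorch >= 2.1):
            # zero out gradients everywhere except the sparse trainable set.
            param.grad.mul_(mask)

        handles.append(p.register_post_accumulate_grad_hook(hook))
    return handles  # keep handles alive; call h.remove() to detach a hook
```

Because only the gradient is masked, any standard optimizer can be used unchanged; the untouched entries simply receive zero updates.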