NOLA: Compressing LoRA using Linear Combination of Random Basis
Authors: Soroush Abbasi Koohpayegani, Navaneet K L, Parsa Nooralinejad, Soheil Kolouri, Hamed Pirsiavash
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present adaptation results using GPT-2, LLaMA-2, and ViT in natural language and computer vision tasks. NOLA performs as well as LoRA models with far fewer parameters compared to LoRA with rank one, the best compression LoRA can achieve. Particularly, on LLaMA-2 70B, our method is almost 20 times more compact than the most compressed LoRA without degradation in accuracy. |
| Researcher Affiliation | Academia | University of California, Davis; Vanderbilt University |
| Pseudocode | No | The paper does not contain a figure, block, or section explicitly labeled 'Pseudocode' or 'Algorithm', nor does it present structured steps formatted like code or an algorithm. |
| Open Source Code | Yes | Our code is available here: https://github.com/UCDvision/NOLA |
| Open Datasets | Yes | Datasets: We utilize the following datasets for our Natural Language Generation (NLG) task: E2E NLG Challenge (Novikova et al., 2017) serves as a commonly used benchmark for evaluating NLG models. DART (Nan et al., 2020) is another significant dataset employed for evaluating data-to-text generation. WebNLG (Gardent et al., 2017) is a data-to-text dataset... We use CIFAR10 (Krizhevsky et al., 2014), CIFAR100 (Krizhevsky et al., 2009), CUB-200-2011 (Welinder et al., 2010), Caltech-101 (Fei-Fei et al., 2004), Aircraft (Maji et al., 2013), Food101 (Bossard et al., 2014), Pets (Parkhi et al., 2012) and SUN397 (Xiao et al., 2010) datasets for finetuning. |
| Dataset Splits | Yes | Datasets: We utilize the following datasets for our Natural Language Generation (NLG) task: E2E NLG Challenge (Novikova et al., 2017) serves as a commonly used benchmark for evaluating NLG models. It comprises 51,200 samples, distributed as follows: 42,200 for training, 4,600 for validation, and an additional 4,600 for testing. |
| Hardware Specification | Yes | Implementation Details: We trained our models using a single NVIDIA RTX 6000 Ada Generation GPU. (...) We optimize for one epoch on the Alpaca dataset with a batch size of 256 using four RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions software like the 'Timm library' and 'Adam optimizer' but does not provide specific version numbers for any of its software dependencies. |
| Experiment Setup | Yes | Implementation Details: We train our models for 5 epochs with a learning rate of 0.1 and no weight decay. We use a batch size of 8. We use a rank of 8 for NOLA in our experiments. Like LoRA, we scale AB with c/r, where c is a hyperparameter and r is the rank. We use the default value of c = 1. |
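
For context on the method named in the title, the sketch below illustrates the core idea in a few lines: instead of training the LoRA factors A and B directly, NOLA re-parameterizes each as a linear combination of frozen, seed-generated random basis matrices and trains only the mixing coefficients, reusing the c/r scaling noted in the Experiment Setup row. This is a minimal, hedged illustration, not the authors' implementation; the class name `NOLALinear`, the number of basis matrices, and the initialization are assumptions made for the example (see the official repository linked in the Open Source Code row for the real code).

```python
# Minimal sketch of a NOLA-style adapter (illustrative; names and defaults
# such as NOLALinear, num_basis_a, and num_basis_b are assumptions, not the
# authors' code). Only the mixing coefficients alpha and beta are trained.
import torch
import torch.nn as nn
import torch.nn.functional as F


class NOLALinear(nn.Module):
    def __init__(self, base_linear: nn.Linear, rank: int = 8,
                 num_basis_a: int = 64, num_basis_b: int = 64,
                 c: float = 1.0, seed: int = 0):
        super().__init__()
        self.base = base_linear.requires_grad_(False)  # frozen pretrained layer
        out_f, in_f = base_linear.weight.shape
        g = torch.Generator().manual_seed(seed)  # basis is reproducible from the seed
        # Frozen random basis matrices; stored as buffers, never updated.
        self.register_buffer("basis_A", torch.randn(num_basis_a, rank, in_f, generator=g))
        self.register_buffer("basis_B", torch.randn(num_basis_b, out_f, rank, generator=g))
        # Trainable mixing coefficients -- the only adapted parameters.
        self.alpha = nn.Parameter(torch.zeros(num_basis_a))
        self.beta = nn.Parameter(0.01 * torch.randn(num_basis_b))
        self.scale = c / rank  # same c/r scaling as LoRA

    def forward(self, x):
        A = torch.einsum("k,kri->ri", self.alpha, self.basis_A)  # (rank, in_f)
        B = torch.einsum("k,kor->or", self.beta, self.basis_B)   # (out_f, rank)
        delta_w = self.scale * (B @ A)                            # low-rank weight update
        return self.base(x) + F.linear(x, delta_w)


# Toy usage: wrap one pretrained layer; only alpha and beta receive gradients.
layer = NOLALinear(nn.Linear(768, 768), rank=8)
out = layer(torch.randn(4, 768))
```

Because the basis matrices are deterministic given a seed, only the coefficient vectors (and the seed) need to be stored per adapted layer, which is how the method can compress below the rank-one LoRA floor mentioned in the Research Type row.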