FouRA: Fourier Low-Rank Adaptation

Authors: Shubhankar Borse, Shreya Kadambi, Nilesh Pandey, Kartikeya Bhardwaj, Viswanath Ganapathy, Sweta Priyadarshi, Risheek Garrepalli, Rafael Esteves, Munawar Hayat, Fatih Porikli

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experiments and analysis, we show that FouRA successfully solves the problems related to data copying and distribution collapse while significantly improving the generated image quality. We demonstrate that FouRA enhances the generalization of fine-tuned models thanks to its adaptive rank selection. We further show that the learned projections in the frequency domain are decorrelated and prove effective when merging multiple adapters. While FouRA is motivated for vision tasks, we also demonstrate its merits for language tasks on commonsense reasoning and GLUE benchmarks. (Section 5: Experiments; a hedged sketch of such a frequency-domain adapter follows the table.)
Researcher Affiliation | Industry | Qualcomm AI Research {sborse, skadambi, mhayat, fporikli}@qti.qualcomm.com
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | Checklist response: "Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No]. Justification: Datasets and code will be provided upon request, as we need legal approval for the same. We are also working on the legal process to provide git access."
Open Datasets | Yes | Datasets: For style transfer, we evaluate FouRA on four datasets collected from public domains, including Bluefire, Paintings, 3D and Origami styles; see Appendix C.1.3 for details. ... The commonsense reasoning training dataset is a combination of the training datasets provided by [20], while we evaluate each evaluation dataset separately. ... All of the datasets and tasks described in Table C.2 are taken from Hugging Face Datasets, and each task has its own respective evaluation metric (see the dataset-loading example after the table).
Dataset Splits | Yes | Blue Fire (Validation): The Bluefire validation set consists of 30 curated text prompts, of which 9 prompts contain one of the 6 categories on which the model was trained, and the remaining 21 prompts correspond to categories on which the low-rank adapter has not been fine-tuned. ... Table C.1: Commonsense Benchmark (includes a #Val column) ... Table C.2: GLUE Benchmark (includes a #Val column)
Hardware Specification | Yes | We trained using 4 NVIDIA A100 GPUs, for 100 epochs at a batch size of 8. ... The training measurements are performed on a Tesla A100 GPU with a batch size of 8.
Software Dependencies | No | We used the kohya-ss repository for finetuning models for the text-to-image stylization task. ... LoRA and FouRA modules are applied in the default places for the stable-diffusion-v1.5 backbone, the same as in Hugging Face Diffusers. (No specific version numbers are provided for software dependencies.)
Experiment Setup | Yes | For each task, we trained both LoRA and FouRA adapters with the same set of hyperparameters. We trained using 4 NVIDIA A100 GPUs, for 100 epochs at a batch size of 8. Our initial learning rate was set to 1e-4 for the UNet and 5e-5 for the text encoder. ... For some ablation studies, we varied the rank between 16, 32, 48 and 64. In all the remaining experiments, we set the rank to 64 unless stated otherwise. (See the adapter-configuration sketch after the table.)
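
The evidence for Research Type refers to learned low-rank projections in the frequency domain and adaptive rank selection, but the summary itself does not describe the mechanism. The snippet below is a minimal, hypothetical sketch of a frequency-domain low-rank adapter, assuming the adapter moves features to the frequency domain with an FFT along the channel dimension, applies a rank-r update there, and transforms back; the class name, initialization, and the omission of FouRA's adaptive rank gating are assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn


class FrequencyLowRankAdapter(nn.Module):
    """Hypothetical sketch of a FouRA-style adapter (not the authors' code).

    Features are moved to the frequency domain with an rFFT, a rank-r
    update is applied there, and the result is transformed back and added
    to the input, mirroring the usual LoRA residual update.
    """

    def __init__(self, dim: int, rank: int = 64, scale: float = 1.0):
        super().__init__()
        freq_dim = dim // 2 + 1  # length of torch.fft.rfft along the last axis
        # Complex low-rank factors; the "up" factor starts at zero so the
        # adapter is a no-op at initialization, as in standard LoRA.
        self.down = nn.Parameter(1e-3 * torch.randn(freq_dim, rank, dtype=torch.cfloat))
        self.up = nn.Parameter(torch.zeros(rank, freq_dim, dtype=torch.cfloat))
        self.scale = scale
        # NOTE: FouRA's adaptive rank selection (gating along the rank
        # dimension) is not modeled in this sketch.

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., dim) real-valued features from the frozen base layer.
        x_f = torch.fft.rfft(x, dim=-1)            # to the frequency domain
        delta_f = (x_f @ self.down) @ self.up      # rank-r update in frequency space
        delta = torch.fft.irfft(delta_f, n=x.shape[-1], dim=-1)  # back to the signal domain
        return x + self.scale * delta
```

In an actual fine-tuning run, such a module would presumably replace the plain low-rank branch at the attention projections where LoRA is normally inserted.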
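Since the GLUE and commonsense tasks are reported to come from Hugging Face Datasets, loading any one of them reduces to a single `load_dataset` call; the task name below (sst2) is only an illustrative choice, not one singled out by the paper.

```python
from datasets import load_dataset

# Any GLUE task listed in Table C.2 loads directly from Hugging Face Datasets;
# "sst2" is just an example, and each task keeps its own evaluation metric.
glue_task = load_dataset("glue", "sst2")
print(glue_task)                      # DatasetDict with train / validation / test splits
print(glue_task["validation"][0])     # a single labelled validation example
```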
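For the "default places" and hyperparameters quoted above, the following is a hedged sketch of how a rank-64 adapter is typically attached to the stable-diffusion-v1.5 attention projections using the Hugging Face diffusers/PEFT integration, with the reported settings (rank 64, 100 epochs, batch size 8, learning rates 1e-4 for the UNet and 5e-5 for the text encoder). The target module names, `lora_alpha`, and optimizer choice are assumptions; the paper's kohya-ss-based training code is not reproduced here.

```python
import torch
from diffusers import StableDiffusionPipeline
from peft import LoraConfig

# Hyperparameters quoted in the table above; everything else is an assumption.
RANK = 64
UNET_LR, TEXT_ENCODER_LR = 1e-4, 5e-5
BATCH_SIZE, EPOCHS = 8, 100

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.unet.requires_grad_(False)          # base weights stay frozen
pipe.text_encoder.requires_grad_(False)

# "Default places" for SD-1.5 adapters: the attention projections of the UNet
# and, when the text encoder is also tuned, its self-attention projections.
pipe.unet.add_adapter(
    LoraConfig(r=RANK, lora_alpha=RANK, target_modules=["to_q", "to_k", "to_v", "to_out.0"])
)
pipe.text_encoder.add_adapter(
    LoraConfig(r=RANK, lora_alpha=RANK, target_modules=["q_proj", "k_proj", "v_proj", "out_proj"])
)

# Separate learning rates for the UNet and the text encoder, as reported.
optimizer = torch.optim.AdamW(
    [
        {"params": [p for p in pipe.unet.parameters() if p.requires_grad], "lr": UNET_LR},
        {"params": [p for p in pipe.text_encoder.parameters() if p.requires_grad], "lr": TEXT_ENCODER_LR},
    ]
)
# ... standard diffusion fine-tuning loop for EPOCHS epochs at BATCH_SIZE ...
```

A FouRA adapter would presumably target the same projection modules, with the low-rank update computed in the frequency domain as sketched earlier.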