Asymmetry in Low-Rank Adapters of Foundation Models

Authors: Jiacheng Zhu, Kristjan Greenewald, Kimia Nadjahi, Haitz Sáez de Ocáriz Borde, Rickard Brüel Gabrielsson, Leshem Choshen, Marzyeh Ghassemi, Mikhail Yurochkin, Justin Solomon

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We support our conclusions with experiments on RoBERTa, BART-Large, LLaMA-2, and ViTs. The code and data is available at https://github.com/Jiacheng-Zhu-AIML/AsymmetryLoRA
Researcher Affiliation | Collaboration | (1) MIT CSAIL, (2) IBM Research, (3) MIT-IBM Watson AI Lab, (4) University of Oxford. Correspondence to: Jiacheng Zhu <zjc@mit.edu>.
Pseudocode | No | The paper does not include any explicitly labeled pseudocode blocks or algorithms.
Open Source Code | Yes | The code and data is available at https://github.com/Jiacheng-Zhu-AIML/AsymmetryLoRA
Open Datasets | Yes | We evaluate the performance of fine-tuning strategies on natural language understanding (GLUE (Wang et al., 2018), MMLU (Hendrycks et al., 2020)), natural language generation (XSum (Narayan et al., 2018) and CNN/Daily Mail (Chen et al., 2016)), and multi-domain image classification (Gulrajani & Lopez-Paz, 2020).
Dataset Splits | Yes | We adhere to the original 80% training and 20% testing splits.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions using "PyTorch" and the "Huggingface Transformers code base" but does not specify version numbers for these or any other software dependencies.
Experiment Setup | Yes | The configuration of our experiments on text generation is listed in Table 10. Table 8. Hyper-parameter setup for GLUE tasks. Table 9. Hyper-parameter setup for summarization tasks. Table 10. Hyper-parameter setup for text generation tasks.
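
The setup rows above (open datasets, splits, software dependencies, experiment setup) point to the paper's Tables 8-10 and its repository rather than reproducing a runnable configuration. The sketch below is a minimal illustration of what a GLUE LoRA fine-tuning setup in this spirit might look like, assuming the Hugging Face transformers, datasets, and peft libraries; the base model, GLUE task, target modules, rank, and other values are placeholders chosen for illustration, not the paper's reported settings, and the lora_A-freezing loop is only one plausible way to probe the B-versus-A asymmetry the paper studies.

```python
# Illustrative sketch only: a LoRA fine-tuning configuration in the spirit of
# the paper's GLUE experiments. Library choices (transformers, datasets, peft),
# the base model, the target modules, and every hyper-parameter value below are
# assumptions for illustration, not the paper's reported configuration
# (see Tables 8-10 of the paper and the linked repository for those).
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

# A GLUE task (MRPC chosen arbitrarily); tokenization and the training loop
# are omitted from this sketch.
raw = load_dataset("glue", "mrpc")

model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2
)

# Placeholder LoRA hyper-parameters.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "value"],
)
model = get_peft_model(model, lora_config)

# The paper contrasts the two LoRA factors. One way to emulate "tune B only,
# keep A at its initialization" with peft is to freeze the parameters whose
# names contain "lora_A" (peft's naming for the A factors):
for name, param in model.named_parameters():
    if "lora_A" in name:
        param.requires_grad = False

model.print_trainable_parameters()
```

For a dataset without predefined splits, an 80/20 division like the one quoted above could be approximated with raw["train"].train_test_split(test_size=0.2); the quoted split itself refers to the multi-domain image classification benchmark (Gulrajani & Lopez-Paz, 2020), which defines its own protocol.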