Expanding Sparse Tuning for Low Memory Usage

Authors: Shufan Shen, Junshu Sun, Xiangyang Ji, Qingming Huang, Shuhui Wang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on multiple downstream tasks show that SNELL achieves state-of-the-art performance with low memory usage, endowing PEFT with sparse tuning to large-scale models.
Researcher Affiliation | Academia | 1 Key Lab of Intell. Info. Process., Inst. of Comput. Tech., CAS; 2 University of Chinese Academy of Sciences; 3 Tsinghua University; 4 Peng Cheng Laboratory
Pseudocode | No | The paper does not contain pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Codes are available at https://github.com/ssfgunner/SNELL.
Open Datasets | Yes | We evaluate our methods on 24 downstream tasks categorized into two groups following SPT [22]. (i) FGVC [30] is a benchmark for fine-grained image classification. ... (ii) VTAB-1k [59] is a large-scale transfer learning benchmark consisting of 19 visual classification tasks.
Dataset Splits | Yes | We follow the validation splits in [22] if the official validation set is unavailable.
Hardware Specification | Yes | Table A14: Training time cost on ViT-B/16 of different PEFT methods using NVIDIA GeForce RTX 4090 GPU.
Software Dependencies | No | Following SPT [22], we use the AdamW optimizer [40] with cosine learning rate decay. ... No specific version numbers for software dependencies are provided.
Experiment Setup | Yes | The batch size, learning rate, and weight decay are 32, 1e-3, and 1e-4, respectively. We also follow SPT [22] to implement the standard data augmentation pipeline for VTAB-1k and follow SSF [35] for FGVC as well. (See the configuration sketch below.)
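
To make the reported training recipe concrete, here is a minimal sketch of the stated optimizer configuration, assuming a standard PyTorch setup. The AdamW optimizer with cosine learning rate decay, batch size 32, learning rate 1e-3, and weight decay 1e-4 are taken from the paper; the placeholder model, epoch count, and omitted dataloader are illustrative assumptions, not the authors' released code.

```python
import torch.nn as nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

# Placeholder module standing in for the PEFT-wrapped ViT-B/16 backbone (assumption).
model = nn.Linear(768, 100)

# Reported hyperparameters: learning rate 1e-3, weight decay 1e-4
# (the batch size of 32 would be set in the dataloader, omitted here).
optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)

# Cosine learning rate decay over the full schedule; the epoch count is an assumption.
num_epochs = 100
scheduler = CosineAnnealingLR(optimizer, T_max=num_epochs)

for epoch in range(num_epochs):
    # ... one training epoch over the downstream task (FGVC or VTAB-1k) ...
    scheduler.step()
```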