Expanding Sparse Tuning for Low Memory Usage
Authors: Shufan Shen, Junshu Sun, Xiangyang Ji, Qingming Huang, Shuhui Wang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on multiple downstream tasks show that SNELL achieves state-of-the-art performance with low memory usage, endowing PEFT with sparse tuning to large-scale models. |
| Researcher Affiliation | Academia | 1Key Lab of Intell. Info. Process., Inst. of Comput. Tech., CAS 2University of Chinese Academy of Sciences 3Tsinghua University 4Peng Cheng Laboratory |
| Pseudocode | No | The paper does not contain pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Codes are available at https://github.com/ssfgunner/SNELL. |
| Open Datasets | Yes | We evaluate our methods on 24 downstream tasks categorized into two groups following SPT [22]. (i) FGVC [30] is a benchmark for fine-grained image classification. ... (ii) VTAB-1k [59] is a large-scale transfer learning benchmark consisting of 19 visual classification tasks. |
| Dataset Splits | Yes | We follow the validation splits in [22] if the official validation set is unavailable. |
| Hardware Specification | Yes | Table A14: Training time cost on ViT-B/16 of different PEFT methods using NVIDIA GeForce RTX 4090 GPU. |
| Software Dependencies | No | Following SPT [22], we use the AdamW optimizer [40] with cosine learning rate decay. ... No specific version numbers for software dependencies are provided. |
| Experiment Setup | Yes | The batch size, learning rate, and weight decay are 32, 1e-3, and 1e-4, respectively. We also follow SPT [22] to implement the standard data augmentation pipeline for VTAB-1k and follow SSF [35] for FGVC as well. (A hedged sketch of this optimizer configuration is given after the table.) |
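The reported hyperparameters (AdamW, learning rate 1e-3, weight decay 1e-4, cosine learning rate decay, batch size 32) map onto a standard PyTorch optimizer/scheduler configuration. The sketch below is an illustration only, assuming a generic PyTorch fine-tuning loop; the function name `build_optimizer_and_scheduler`, the `num_epochs` argument, and the assumption that only gradient-enabled parameters are tuned are placeholders, not part of the released SNELL codebase (https://github.com/ssfgunner/SNELL).

```python
# Hedged sketch of the reported fine-tuning setup: AdamW with lr=1e-3,
# weight_decay=1e-4, and cosine learning rate decay. Model, dataloader,
# and epoch count are assumed to be defined elsewhere by the caller.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR


def build_optimizer_and_scheduler(model: torch.nn.Module, num_epochs: int):
    # In a PEFT setting, only the parameters left with requires_grad=True
    # (e.g., the sparse tuning parameters) are passed to the optimizer.
    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = AdamW(trainable, lr=1e-3, weight_decay=1e-4)
    # Cosine decay of the learning rate over the full training schedule.
    scheduler = CosineAnnealingLR(optimizer, T_max=num_epochs)
    return optimizer, scheduler
```

In use, `scheduler.step()` would be called once per epoch after `optimizer.step()` updates, with mini-batches of size 32 as reported; batch size and data augmentation are handled by the dataloader and are not shown here.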