Parameter-Efficient Fine-Tuning Design Spaces
Authors: Jiaao Chen, Aston Zhang, Xingjian Shi, Mu Li, Alex Smola, Diyi Yang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments reveal the following design patterns... These patterns lead to new methods for parameter-efficient fine-tuning, which we show experimentally outperform existing strategies across various backbone models and NLP tasks. |
| Researcher Affiliation | Collaboration | Georgia Institute of Technology, Amazon Web Services, Stanford University |
| Pseudocode | No | The paper describes its methods and components in detail but does not include any explicitly labeled pseudocode blocks or algorithms in a structured format. |
| Open Source Code | Yes | We will release our code at https://github.com/amazon-science/peft-design-spaces. |
| Open Datasets | Yes | Our process is based on the average performance on the widely-used GLUE benchmark (Wang et al., 2018). It covers a wide range of natural language understanding tasks. |
| Dataset Splits | Yes | Our process is based on the average performance on the widely-used GLUE benchmark (Wang et al., 2018). ... To quantify the overall quality of models in any design space Si with a low-compute, low-epoch regime (Radosavovic et al., 2020), we randomly sample 100 models from Si, fine-tune with only 3 epochs. (See the sampling sketch below the table.) |
| Hardware Specification | Yes | All the experiments were performed using 8 A100 GPUs. |
| Software Dependencies | No | The paper mentions 'We use Hugging Face Transformers for our implementations' but does not specify a version number for this or any other software dependency. |
| Experiment Setup | Yes | The batch size was 128 for base models and 64 for large models. The maximum learning rate was 5e-5 and the maximum number of training epochs was set to be either 5 or 10. (See the configuration sketch below the table.) |
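
The "randomly sample 100 models ... fine-tune with only 3 epochs" procedure quoted in the Dataset Splits row is the paper's low-compute proxy for ranking design spaces (following Radosavovic et al., 2020). The sketch below is a minimal illustration of that loop, not the authors' code: the dictionary encoding of a design space and the `finetune_and_score` helper are hypothetical; only the sample size (100) and the 3-epoch budget come from the paper.

```python
import random


def sample_config(design_space: dict) -> dict:
    """Draw one PEFT configuration uniformly from the candidate choices."""
    return {name: random.choice(choices) for name, choices in design_space.items()}


def evaluate_design_space(design_space: dict, finetune_and_score, n_models: int = 100) -> float:
    """Average score of n_models randomly sampled, briefly fine-tuned models.

    `finetune_and_score` is a hypothetical helper that trains one sampled
    configuration and returns its average GLUE score.
    """
    scores = []
    for _ in range(n_models):
        config = sample_config(design_space)
        scores.append(finetune_and_score(config, epochs=3))  # low-epoch regime
    return sum(scores) / len(scores)
```

In the paper, the design space with the higher average score under this cheap proxy is kept and refined further; the sketch only shows the scoring step.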
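
The reported hyperparameters map onto a standard Hugging Face Transformers fine-tuning run, which the paper states it uses. The following is a hedged configuration sketch, not the released implementation: the backbone (`roberta-base`), the single GLUE task (`sst2`), the sequence length, and reading the batch size as a per-device value are assumptions; only the batch size, learning rate, and epoch count come from the paper.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

task = "sst2"                # one GLUE task; the paper averages over the GLUE benchmark
model_name = "roberta-base"  # assumed backbone; the paper evaluates several backbones

raw = load_dataset("glue", task)
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoded = raw.map(
    lambda ex: tokenizer(ex["sentence"], truncation=True, padding="max_length", max_length=128),
    batched=True,
)

model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=128,  # reported: 128 for base models, 64 for large models
    learning_rate=5e-5,               # reported maximum learning rate
    num_train_epochs=5,               # reported: either 5 or 10 training epochs
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)
trainer.train()
```

Note that this sketch fine-tunes the full backbone; the paper's contribution is the parameter-efficient module layout, which would replace or wrap parts of `model` before training.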