Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Parameter-Efficient Fine-Tuning Design Spaces
Authors: Jiaao Chen, Aston Zhang, Xingjian Shi, Mu Li, Alex Smola, Diyi Yang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments reveal the following design patterns... These patterns lead to new methods for parameter-efficient fine-tuning, which we show experimentally outperform existing strategies across various backbone models and NLP tasks. |
| Researcher Affiliation | Collaboration | Georgia Institute of Technology, Amazon Web Services, Stanford University |
| Pseudocode | No | The paper describes its methods and components in detail but does not include any explicitly labeled pseudocode blocks or algorithms in a structured format. |
| Open Source Code | Yes | We will release our code at https://github.com/amazon-science/peft-design-spaces. |
| Open Datasets | Yes | Our process is based on the average performance on the widely-used GLUE benchmark (Wang et al., 2018). It covers a wide range of natural language understanding tasks. |
| Dataset Splits | Yes | Our process is based on the average performance on the widely-used GLUE benchmark (Wang et al., 2018). ... To quantify the overall quality of models in any design space S_i with a low-compute, low-epoch regime (Radosavovic et al., 2020), we randomly sample 100 models from S_i, fine-tune with only 3 epochs. |
| Hardware Specification | Yes | All the experiments were performed using 8 A100 GPUs. |
| Software Dependencies | No | The paper mentions 'We use Hugging Face Transformers for our implementations' but does not specify a version number for this or any other software dependency. |
| Experiment Setup | Yes | The batch size was 128 for base models and 64 for large models. The maximum learning rate was 5e-5 and the maximum number of training epochs was set to be either 5 or 10. |
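
The Experiment Setup and Hardware Specification rows can be collected into a small configuration object, which makes the reported numbers easier to reuse when attempting a reproduction. The sketch below is illustrative only: the class `ReportedSetup` and its field names are hypothetical and do not come from the paper or its code release. It also assumes that the quoted batch sizes are global (summed across all 8 GPUs, with no gradient accumulation); the quoted text does not say this.

```python
from dataclasses import dataclass

# Values quoted in the table above.
GLOBAL_BATCH_SIZE = {"base": 128, "large": 64}  # "128 for base models and 64 for large models"
ALLOWED_MAX_EPOCHS = (5, 10)                    # "either 5 or 10"


@dataclass(frozen=True)
class ReportedSetup:
    """Hypothetical container for the hyperparameters reported in the table."""

    model_size: str                   # "base" or "large"
    max_epochs: int                   # 5 or 10
    max_learning_rate: float = 5e-5   # "maximum learning rate was 5e-5"
    num_gpus: int = 8                 # "8 A100 GPUs"

    def __post_init__(self) -> None:
        # Reject values that the table does not report.
        if self.model_size not in GLOBAL_BATCH_SIZE:
            raise ValueError(f"unknown model size: {self.model_size!r}")
        if self.max_epochs not in ALLOWED_MAX_EPOCHS:
            raise ValueError(f"max_epochs must be one of {ALLOWED_MAX_EPOCHS}")

    @property
    def global_batch_size(self) -> int:
        return GLOBAL_BATCH_SIZE[self.model_size]

    @property
    def per_gpu_batch_size(self) -> int:
        # ASSUMPTION: the reported batch size is global and is split evenly
        # across the GPUs; the table does not state this.
        return self.global_batch_size // self.num_gpus


base = ReportedSetup(model_size="base", max_epochs=5)
large = ReportedSetup(model_size="large", max_epochs=10)
print(base.per_gpu_batch_size, large.per_gpu_batch_size)
```

Under the even-split assumption, this gives 16 examples per GPU for base models and 8 for large models. If the reported figures are instead per-device batch sizes, the `per_gpu_batch_size` property would not apply.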