Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization
Authors: Xinhao Yao, Hongjin Qian, Xiaolin Hu, Gengze Xu, Wei Liu, Jian Luan, Bin Wang, Yong Liu
IJCAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on benchmark datasets validate the effectiveness of this approach, supporting our theoretical findings. Our analysis lays the theoretical groundwork for configuring and improving algorithms in LLMs fine-tuning. |
| Researcher Affiliation | Collaboration | 1Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China 4Beijing Academy of Artificial Intelligence 5XiaoMi |
| Pseudocode | No | The paper describes methods like LoRA and Prefix tuning using mathematical equations and textual explanations, but does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Extended version and code are available at https://github.com/XiaoMi/EfficientFT. |
| Open Datasets | Yes | Experimental results for our strategy (in Section 5) on benchmark datasets [Wang et al., 2018] and open source pre-trained models [Liu et al., 2019; AI@Meta, 2024] verify that the method can visibly influence fine-tuning efficiency. |
| Dataset Splits | Yes | We report results on development set, Pearson correlation for STSB, Matthew's correlation for CoLA, average accuracy for MNLI (matched and mismatched), and accuracy for other tasks. |
| Hardware Specification | No | The paper mentions 'limited computational resources' but does not provide any specific details about the hardware used, such as GPU or CPU models. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers, such as Python versions or library versions. |
| Experiment Setup | Yes | The LoRA hyperparameters were set to α = r = 8. All reported values represent the average results across 3 random seeds. ...evaluated the performance for λ values of 2, 4, and 8 (one can also determine a general optimal ratio through experiments, and even apply different settings across different layers of the model). |
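
The setup quoted in the Experiment Setup and Dataset Splits rows can be sketched as a small run grid. This is a minimal illustration, not the authors' code: the seed values, dictionary keys, and function names are hypothetical, and λ is treated as an opaque hyperparameter because this summary does not define it.

```python
from itertools import product
from statistics import mean

# Values stated in the summary table.
LORA_RANK = 8             # r
LORA_ALPHA = 8            # alpha, set equal to r
LAMBDA_VALUES = (2, 4, 8)

# Assumption: the summary says "3 random seeds" but does not list them.
SEEDS = (0, 1, 2)


def build_runs():
    """Enumerate one run per (lambda, seed) pair with fixed LoRA settings."""
    return [
        {"r": LORA_RANK, "alpha": LORA_ALPHA, "lambda": lam, "seed": seed}
        for lam, seed in product(LAMBDA_VALUES, SEEDS)
    ]


def average_over_seeds(results):
    """Map {(lambda, seed): score} to {lambda: mean score across seeds}."""
    by_lambda = {}
    for (lam, _seed), score in results.items():
        by_lambda.setdefault(lam, []).append(score)
    return {lam: mean(scores) for lam, scores in by_lambda.items()}


def mnli_average(matched_acc, mismatched_acc):
    """MNLI is reported as the mean of matched and mismatched accuracy."""
    return (matched_acc + mismatched_acc) / 2
```

`build_runs()` yields nine configurations (three λ values times three seeds). `average_over_seeds` collapses per-seed scores into the single reported value for each λ.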