Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization

Authors: Xinhao Yao, Hongjin Qian, Xiaolin Hu, Gengze Xu, Wei Liu, Jian Luan, Bin Wang, Yong Liu

IJCAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results on benchmark datasets validate the effectiveness of this approach, supporting our theoretical findings. Our analysis lays the theoretical groundwork for configuring and improving algorithms in LLM fine-tuning.
Researcher Affiliation Collaboration 1 Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; 4 Beijing Academy of Artificial Intelligence; 5 Xiaomi
Pseudocode No The paper describes methods like LoRA and Prefix tuning using mathematical equations and textual explanations, but does not present any structured pseudocode or algorithm blocks.
Open Source Code Yes Extended version and code are available at https://github.com/XiaoMi/EfficientFT.
Open Datasets Yes Experimental results for our strategy (in Section 5) on benchmark datasets [Wang et al., 2018] and open source pre-trained models [Liu et al., 2019; AI@Meta, 2024] verify that the method can visibly influence fine-tuning efficiency.
Dataset Splits Yes We report results on the development set: Pearson correlation for STS-B, Matthews correlation for CoLA, average accuracy for MNLI (matched and mismatched), and accuracy for other tasks.
Hardware Specification No The paper mentions 'limited computational resources' but does not provide any specific details about the hardware used, such as GPU or CPU models.
Software Dependencies No The paper does not provide specific software dependencies with version numbers, such as Python versions or library versions.
Experiment Setup Yes The LoRA hyperparameters were set to α = r = 8. All reported values represent the average results across 3 random seeds. ...evaluated the performance for λ values of 2, 4, and 8 (one can also determine a general optimal ratio through experiments, and even apply different settings across different layers of the model).
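The setup row above reports LoRA hyperparameters α = r = 8, under which the standard LoRA scaling factor α/r equals 1. A minimal dependency-free sketch of the LoRA reparameterization ΔW = (α/r)·B·A is shown below; the function name and list-based matrix representation are illustrative assumptions, not the paper's code.

```python
def lora_delta(A, B, alpha, r):
    """Illustrative sketch (not the paper's implementation): compute the
    scaled low-rank LoRA update (alpha / r) * B @ A.

    A: r x d_in factor (list of lists), B: d_out x r factor (list of lists).
    Returns the d_out x d_in update added to the frozen weight matrix.
    """
    scale = alpha / r  # with alpha = r (e.g. both 8, as in the paper), scale = 1
    d_out, d_in = len(B), len(A[0])
    return [
        [scale * sum(B[i][k] * A[k][j] for k in range(r)) for j in range(d_in)]
        for i in range(d_out)
    ]


# Tiny rank-1 example: with alpha == r the update is exactly B @ A.
delta = lora_delta(A=[[1.0, 2.0]], B=[[3.0], [4.0]], alpha=1, r=1)
# delta == [[3.0, 6.0], [4.0, 8.0]]
```

In practice frameworks initialize B to zeros so the update starts at ΔW = 0 and training only gradually perturbs the pretrained weights.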