Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Does Low Rank Adaptation Lead to Lower Robustness against Training-Time Attacks?

Authors: Zi Liang, Haibo Hu, Qingqing Ye, Yaxin Xiao, Ronghua Li

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this paper, we theoretically investigate the security implications of LoRA's low-rank structure during fine-tuning, in the context of its robustness against data poisoning and backdoor attacks. We propose an analytical framework that models LoRA's training dynamics, employs the neural tangent kernel to simplify the analysis of the training process, and applies information theory to establish connections between LoRA's low-rank structure and its vulnerability against training-time attacks. Our analysis indicates that LoRA exhibits better robustness to backdoor attacks than full fine-tuning, while it becomes more vulnerable to untargeted data poisoning due to its over-simplified information geometry. Extensive experimental evaluations have corroborated our theoretical findings.
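The "low-rank structure" discussed above refers to LoRA's reparameterization of a weight update ΔW (d×k) as a product B·A of two thin matrices with rank r ≪ min(d, k). A minimal sketch (our own illustration, not code from the paper) of the resulting parameter reduction:

```python
# Hedged illustration (not from the paper): LoRA replaces a full d x k
# trainable update with a rank-r factorization B (d x r) @ A (r x k),
# shrinking the trainable parameter count from d*k to r*(d + k).

def full_param_count(d: int, k: int) -> int:
    """Trainable parameters when fine-tuning the full d x k weight."""
    return d * k

def lora_param_count(d: int, k: int, r: int) -> int:
    """Trainable parameters of a rank-r LoRA adapter for a d x k weight."""
    return r * (d + k)

# Example: a 1024 x 1024 projection with the paper's default rank r = 8.
d, k, r = 1024, 1024, 8
print(full_param_count(d, k))      # 1048576
print(lora_param_count(d, k, r))   # 16384 -- a 64x reduction
```

This restricted parameterization is precisely what the paper's information-geometric analysis ties to LoRA's differing vulnerability to backdoor versus untargeted poisoning attacks.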
Researcher Affiliation Academia The Hong Kong Polytechnic University, Hong Kong, China. Correspondence to: Haibo Hu <EMAIL>.
Pseudocode No The paper describes methods using mathematical equations and prose, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code Yes Our source code is available at: https://github.com/liangzid/LoRA-sSecurity.
Open Datasets Yes we conduct fine-tuning of natural language understanding models on the GLUE benchmark (Wang et al., 2018) as our primary evaluation environment. Specifically, we utilize BERT-large (Devlin et al., 2019) as the backbone model and evaluate their performance on six binary classification tasks, including SST-2 (Socher et al., 2013), COLA (Warstadt et al., 2018), QNLI (Wang et al., 2018), QQP (Sharma et al., 2019), RTE (Poliak, 2020), and MRPC (Dolan & Brockett, 2005). We use the instruction-following dataset Alpaca (Taori et al., 2023) as the supervised fine-tuning (SFT) training set
Dataset Splits No The paper states the use of various GLUE benchmark tasks and the Alpaca dataset, but it does not explicitly provide specific dataset split percentages, sample counts, or detailed splitting methodology within the main text for reproduction. While these benchmarks often have standard splits, the paper does not specify them.
Hardware Specification Yes All experiments are conducted on eight 24 GB Nvidia RTX 4090 GPUs. These experiments are conducted on an Nvidia H100 GPU.
Software Dependencies No The paper does not explicitly list specific software dependencies with version numbers (e.g., Python version, PyTorch version, or other library versions).
Experiment Setup Yes The maximum sequence length is set to 512, and the batch size is fixed at 8. For learning rates, we apply 3×10^-5 for LoRA's low-rank fine-tuning and 3×10^-6 for both LoRA's high-rank fine-tuning and FF. Each fine-tuning procedure is conducted for a maximum of 10,000 steps. For LoRA-specific settings, we use a rank of 8 and set the scaling parameter α to 16 as default values.
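The reported hyperparameters can be collected into a single configuration sketch. This is our own summary (the dictionary keys and helper are hypothetical, not the authors' code); it also shows the standard LoRA scaling factor α/r implied by the defaults:

```python
# Hedged sketch of the reported setup; key names are our own, not the paper's.
config = {
    "max_seq_length": 512,
    "batch_size": 8,
    "lr_lora_low_rank": 3e-5,   # LoRA low-rank fine-tuning
    "lr_lora_high_rank": 3e-6,  # LoRA high-rank fine-tuning
    "lr_full_finetune": 3e-6,   # FF (full fine-tuning)
    "max_steps": 10_000,
    "lora_rank": 8,
    "lora_alpha": 16,
}

def lora_scaling(alpha: int, rank: int) -> float:
    """Conventional LoRA scaling applied to the B @ A update."""
    return alpha / rank

print(lora_scaling(config["lora_alpha"], config["lora_rank"]))  # 2.0
```

With rank 8 and α = 16, the adapter update is scaled by 2.0 before being added to the frozen weight, following the usual LoRA convention.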