On the Effectiveness of Parameter-Efficient Fine-Tuning

Authors: Zihao Fu, Haoran Yang, Anthony Man-Cho So, Wai Lam, Lidong Bing, Nigel Collier

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on several tasks. The experimental results show that our proposed SAM model outperforms many strong baseline models and it also verifies our theoretical analysis.
Researcher Affiliation | Collaboration | 1 Language Technology Lab, University of Cambridge; 2 The Chinese University of Hong Kong; 3 DAMO Academy, Alibaba Group
Pseudocode | No | No structured pseudocode or algorithm blocks were found.
Open Source Code | Yes | The source code of this paper can be obtained from https://github.com/fuzihaofzh/AnalyzeParameterEfficientFinetune
Open Datasets | Yes | We build our models with the jiant framework and test our models on several GLUE (Wang et al. 2018) and SuperGLUE (Wang et al. 2019) tasks. ... we choose several tasks including Corpus of Linguistic Acceptability (CoLA) (Warstadt, Singh, and Bowman 2019), Semantic Textual Similarity Benchmark (STSB) (Cer et al. 2017), Microsoft Research Paraphrase Corpus (MRPC) (Dolan and Brockett 2005), Recognizing Textual Entailment (RTE) (Dagan, Glickman, and Magnini 2005; Bentivogli et al. 2009), Commitment Bank (CB) (De Marneffe, Simons, and Tonhauser 2019), Choice of Plausible Alternatives (COPA) (Roemmele, Bejan, and Gordon 2011), and Winograd Schema Challenge (WSC) (Levesque, Davis, and Morgenstern 2012). (A dataset-loading sketch follows the table.)
Dataset Splits | Yes | Different from many previous works that train models without validation, we split the original training set by randomly sampling 10% as the new development set while using the remaining 90% samples to train the model. (A minimal split sketch follows the table.)
Hardware Specification | Yes | We run the models on NVIDIA TITAN RTX GPU with 24GB memory.
Software Dependencies | No | The paper mentions 'jiant framework', 'Adapter Hub', 'loralib', and 'transformers toolkit' but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | Instead of training the model for a fixed number of epochs, we use the new development set to do early-stop training by setting the tolerance for all models to 40. ... Following the setting of Guo, Rush, and Kim (2021), we set the sparsity to 0.005 for all models for a fair comparison. In SAM, we calculate ∇L(θ0)_i by accumulating the gradient for a few burn-in steps, as we cannot load all the training data into memory; the burn-in steps are chosen from {500, 600, 700, 800, 900, 1000, 2000} on the development set as a hyper-parameter. (A burn-in and parameter-selection sketch follows the table.)
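
The GLUE and SuperGLUE tasks listed in the Open Datasets row are all publicly available. The paper loads them through the jiant framework; the snippet below is only a quick alternative way to inspect the same data with the Hugging Face datasets library, not the authors' pipeline.

```python
from datasets import load_dataset

# GLUE tasks used in the paper (the authors load these via jiant, not via `datasets`).
cola = load_dataset("glue", "cola")    # Corpus of Linguistic Acceptability
stsb = load_dataset("glue", "stsb")    # Semantic Textual Similarity Benchmark
mrpc = load_dataset("glue", "mrpc")    # Microsoft Research Paraphrase Corpus
rte = load_dataset("glue", "rte")      # Recognizing Textual Entailment

# SuperGLUE tasks used in the paper.
cb = load_dataset("super_glue", "cb")      # Commitment Bank
copa = load_dataset("super_glue", "copa")  # Choice of Plausible Alternatives
wsc = load_dataset("super_glue", "wsc")    # Winograd Schema Challenge

print(cola)  # shows the train / validation / test splits shipped with GLUE
```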
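
The 10%/90% split described in the Dataset Splits row is straightforward to reproduce on any training set. The helper below is an illustrative sketch; the function name and the fixed seed are assumptions, not taken from the paper.

```python
import random

def split_train_dev(train_examples, dev_fraction=0.1, seed=42):
    """Randomly hold out a fraction of the original training set as a new development set."""
    examples = list(train_examples)
    random.Random(seed).shuffle(examples)
    n_dev = int(len(examples) * dev_fraction)
    new_dev = examples[:n_dev]       # 10% -> new development set
    new_train = examples[n_dev:]     # remaining 90% -> training set
    return new_train, new_dev
```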
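
The Experiment Setup row describes two mechanics worth unpacking: the per-parameter gradient magnitude ∇L(θ0)_i is accumulated over a burn-in phase, and only the top fraction of parameters given by the sparsity (0.005, i.e. 0.5%) is then kept trainable. The PyTorch sketch below illustrates that idea; the function name, the top-k thresholding, and the (inputs, labels) batch format are assumptions rather than the authors' exact implementation (see their released code for that).

```python
import torch

def select_trainable_params(model, data_loader, loss_fn, burn_in_steps=500, sparsity=0.005):
    """Accumulate |gradient| at the pretrained parameters over burn-in steps,
    then build masks that keep only the top `sparsity` fraction trainable."""
    grad_accum = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    data_iter = iter(data_loader)
    for _ in range(burn_in_steps):
        inputs, labels = next(data_iter)   # assumed batch format
        model.zero_grad()
        loss = loss_fn(model(inputs), labels)
        loss.backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                grad_accum[n] += p.grad.abs()

    # Rank all parameters by accumulated gradient magnitude and keep the top fraction.
    all_scores = torch.cat([g.flatten() for g in grad_accum.values()])
    k = max(1, int(sparsity * all_scores.numel()))
    threshold = torch.topk(all_scores, k).values.min()

    # 1 = trainable, 0 = frozen; during fine-tuning the mask can zero out gradients
    # of frozen entries, e.g. p.grad.mul_(masks[n]) before each optimizer step.
    masks = {n: (grad_accum[n] >= threshold).float() for n in grad_accum}
    return masks
```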