Parameter-Efficient Fine-Tuning with Controls
Authors: Chi Zhang, Cheng Jingpu, Yanyu Xu, Qianxiao Li
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical findings substantiate that, without introducing any additional parameters, this approach surpasses the LoRA algorithms across all assessed datasets and rank configurations. (Section 6, Experiment:) In this part, we evaluate the effectiveness of the nonlinear controllers by conducting a series of experiments on vision datasets. Table 1 reports the performance of all algorithms, with the same pre-trained ViT backbone. |
| Researcher Affiliation | Academia | *Equal contribution. 1 Department of Maths, National University of Singapore, Singapore; 2 The Joint SDU-NTU Research Center of Artificial Intelligence, Shandong University, China. Correspondence to: Qianxiao Li <Qianxiao@nus.edu.sg>. |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide any statement about making its source code available, nor does it include a link to a code repository. |
| Open Datasets | Yes | We commence with a numerical verification of the condition outlined in Theorem 4.2 through a small-size example. In particular, we consider a scenario wherein the original model is a randomly initialized 10-layer ViT model. We now proceed to validate our approach on various vision benchmarks, including CIFAR100 (Krizhevsky et al., 2009), SVHN (Netzer et al., 2011) and Food-101 (Bossard et al., 2014). |
| Dataset Splits | No | The paper mentions using standard datasets like CIFAR100, SVHN, and Food-101 and mirroring experimental settings from AdaptFormer (Chen et al., 2022). However, it does not explicitly state the train/validation/test dataset splits (e.g., percentages, sample counts, or specific split methodology) within the paper itself that would be needed for reproduction. |
| Hardware Specification | Yes | All experiments are conducted on the Nvidia-3090. |
| Software Dependencies | No | The paper mentions using 'Stochastic Gradient Descent (SGD) algorithm with a momentum of 0.9' but does not specify version numbers for any software components, libraries, or programming languages used (e.g., Python, PyTorch, etc.). |
| Experiment Setup | Yes | The Stochastic Gradient Descent (SGD) algorithm with a momentum of 0.9 is employed for optimizing the controls during the training process. The batch size is set to 128 and the learning rate to 0.05. The down-projection layer weights in the controls are initialized using Kaiming normal initialization (He et al., 2015), while the up-projection layer weights are set to 0. Analogously, all biases in the controls are initialized to 0. |
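The reported setup (SGD with momentum 0.9, learning rate 0.05, Kaiming-normal down-projection, zero-initialized up-projection and biases) can be sketched as follows. This is a minimal numpy illustration, not the authors' code: the function names `init_control` and `sgd_momentum_step`, and the assumption that each control is a rank-r down/up projection pair, are ours; the paper does not specify the module layout beyond the initialization scheme quoted above.

```python
import numpy as np

def init_control(d_model, rank, seed=0):
    """Initialize one low-rank control module per the reported scheme:
    down-projection via Kaiming normal, up-projection and bias at zero."""
    rng = np.random.default_rng(seed)
    # Kaiming normal (He et al., 2015): std = sqrt(2 / fan_in)
    down = rng.normal(0.0, np.sqrt(2.0 / d_model), size=(rank, d_model))
    # Zero up-projection => the control's output is zero at initialization,
    # so training starts exactly from the pre-trained backbone.
    up = np.zeros((d_model, rank))
    bias = np.zeros(d_model)
    return down, up, bias

def sgd_momentum_step(param, grad, velocity, lr=0.05, momentum=0.9):
    """One SGD-with-momentum update using the reported lr and momentum."""
    velocity = momentum * velocity - lr * grad
    return param + velocity, velocity
```

Because the up-projection starts at zero, the fine-tuned model is initially identical to the pre-trained one, a common choice for adapter-style methods that keeps early training stable.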