Sparse Structure Search for Delta Tuning

Authors: Shengding Hu, Zhen Zhang, Ning Ding, Yadao Wang, Yasheng Wang, Zhiyuan Liu, Maosong Sun

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that S3Delta surpasses manual and random structures with fewer trainable parameters.
Researcher Affiliation | Collaboration | 1Dept. of Comp. Sci. & Tech., Institute for AI, Tsinghua University, Beijing, China [...] 3Noah's Ark Lab, Huawei
Pseudocode | Yes | Algorithm 1: Algorithm of S3Delta
Open Source Code | Yes | Our codes are publicly available at https://github.com/thunlp/S3Delta.
Open Datasets | Yes | We apply S3Delta to the multitask benchmarks GLUE [38] and SuperGLUE [37] following previous works. All datasets are downloaded from the Hugging Face Datasets [19].
Dataset Splits | Yes | Since the official test splits of these datasets are held out and invisible to researchers, we make new train, validation, and test splits by randomly splitting either the train set or the validation set, which is critical for fair evaluation according to Chen et al. [4].
Hardware Specification | Yes | All experiments are conducted on 8 NVIDIA GeForce RTX 3090 GPUs.
Software Dependencies | No | The paper mentions using Hugging Face Datasets but does not provide version numbers for any software dependencies, such as PyTorch or Python.
Experiment Setup | Yes | We fix the random seed to 42 for all experiments unless explicitly specified. We train the model for 200 epochs for structure search and 100 epochs for evaluation. The learning rate is 1e-4 for fine-tuning and 1e-3 for all delta tuning methods. We use AdamW as our optimizer. The batch size is 32.
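The dataset-splits row describes re-splitting the visible data because the official test labels are hidden. A minimal sketch of such a re-splitting step is below; the function name, split proportions, and return layout are assumptions for illustration, not taken from the paper.

```python
import random


def resplit(examples, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle the visible examples and cut them into three disjoint splits.

    The fractions and seed are hypothetical defaults; the paper only states
    that new train/validation/test splits are drawn from the visible data.
    """
    rng = random.Random(seed)          # fixed seed for a reproducible split
    examples = list(examples)
    rng.shuffle(examples)
    n = len(examples)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    return {
        "test": examples[:n_test],
        "validation": examples[n_test:n_test + n_val],
        "train": examples[n_test + n_val:],
    }
```

With a fixed seed, every run reproduces the same partition, which is the property the review highlights as critical for fair evaluation.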
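The experiment-setup row lists concrete hyperparameters; a sketch of how they might be collected and how the seed could be fixed is shown below. The config structure and helper function are illustrative assumptions, not the authors' code, and a real run would also seed NumPy, PyTorch, and CUDA.

```python
import random

# Hyperparameters as reported in the review (values from the paper's setup).
CONFIG = {
    "seed": 42,             # fixed for all experiments unless specified
    "optimizer": "AdamW",
    "batch_size": 32,
    "lr_finetune": 1e-4,    # full fine-tuning learning rate
    "lr_delta": 1e-3,       # learning rate for all delta tuning methods
    "epochs_search": 200,   # structure search phase
    "epochs_eval": 100,     # evaluation phase
}


def set_seed(seed):
    """Fix Python's RNG; real experiments would also seed numpy/torch/cuda."""
    random.seed(seed)
```

Seeding once at startup makes sampling-dependent steps (shuffling, initialization) repeatable across runs with the same config.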