Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
Authors: Yuxin Zhang, Lirui Zhao, Mingbao Lin, Yunyun Sun, Yiwu Yao, Xingjia Han, Jared Tanner, Shiwei Liu, Rongrong Ji
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on LLaMA-V1/V2, Vicuna, and OPT across various benchmarks demonstrate the effectiveness of DSnoT in enhancing the performance of sparse LLMs, especially at high sparsity levels. |
| Researcher Affiliation | Collaboration | Yuxin Zhang (1,2), Lirui Zhao (1), Mingbao Lin (3), Yunyun Sun (4), Yiwu Yao (4), Xingjia Han (4), Jared Tanner (5), Shiwei Liu (5,6,7), Rongrong Ji (1,8). Affiliations: 1 Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University; 2 Pengcheng Lab; 3 Tencent Youtu Lab; 4 Huawei Technologies; 5 University of Oxford; 6 University of Texas at Austin; 7 Eindhoven University of Technology; 8 Institute of Artificial Intelligence, Xiamen University |
| Pseudocode | Yes | Algorithm 1: Pseudocode of DSnoT. (A hedged sketch of such a prune-and-grow cycle is given after this table.) |
| Open Source Code | Yes | Codes are available at https://github.com/zyxxmu/DSnoT. |
| Open Datasets | Yes | calibration data consists of 128 segments, each with 2048 tokens. These segments are randomly selected from the first shard of the C4 dataset (Raffel et al., 2020). (See the calibration-loading sketch after this table.) |
| Dataset Splits | Yes | we assess the performance of pruned models by calculating the perplexity of language generation experiments on separate validation sets derived from WikiText-2 (Merity et al., 2016). (See the perplexity sketch after this table.) |
| Hardware Specification | Yes | All pruning experiments are conducted on NVIDIA A100 GPUs with 80GB of memory. |
| Software Dependencies | No | We implement DSnoT in PyTorch (Paszke et al., 2019) and use the Hugging Face Transformers library (Wolf et al., 2019) for handling models and datasets. (No version numbers provided for PyTorch or Hugging Face Transformers.) |
| Experiment Setup | Yes | For the hyper-parameter settings, we set the maximum cycle T = 50 and the update threshold ϵ = 0.1 in all experiments. |
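The Pseudocode row above refers to Algorithm 1 of the paper. As a rough illustration only, here is a minimal PyTorch sketch of a training-free, per-layer prune-and-grow cycle in that spirit, using the reported maximum cycle T = 50 and update threshold ϵ = 0.1. The function name `dsnot_layer_sketch`, the sign-matched growing score, the Wanda-style |w|·‖x‖ pruning score, and the stopping test are all assumptions made for illustration, not the paper's exact criteria.

```python
# Illustrative sketch only: a training-free prune-and-grow cycle in the spirit
# of DSnoT's Algorithm 1. Criteria and stopping rule are assumptions.
import torch

def dsnot_layer_sketch(W_dense, mask, X, T=50, eps=0.1):
    """Adjust a layer's sparsity mask without any weight updates.

    W_dense : (out, in) dense weights of the layer
    mask    : (out, in) float 0/1 mask from an initial pruner (e.g. Wanda)
    X       : (n_tokens, in) calibration activations feeding this layer
    """
    x_mean = X.mean(dim=0)   # expected input per channel
    x_norm = X.norm(dim=0)   # per-channel activation norm (Wanda-style)
    rows = torch.arange(mask.shape[0])
    for _ in range(T):
        # Row-wise reconstruction error between dense and sparse outputs
        # on the average calibration input.
        err = (W_dense - W_dense * mask) @ x_mean
        if err.abs().max() < eps:
            break
        # Grow: revive, per row, the pruned weight whose expected contribution
        # best cancels the current error (sign-matched, largest magnitude).
        contrib = W_dense * x_mean
        grow_score = torch.where(mask.bool(),
                                 torch.full_like(contrib, float("-inf")),
                                 contrib * err.sign().unsqueeze(1))
        grow_idx = grow_score.argmax(dim=1)
        # Prune: drop, per row, the kept weight with the smallest
        # Wanda-style importance |w| * ||x||, keeping sparsity constant.
        imp = torch.where(mask.bool(),
                          W_dense.abs() * x_norm,
                          torch.full_like(W_dense, float("inf")))
        prune_idx = imp.argmin(dim=1)
        mask[rows, grow_idx] = 1.0
        mask[rows, prune_idx] = 0.0
    return mask
```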
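The Open Datasets row quotes the calibration setup (128 random segments of 2048 tokens from the first shard of C4). A sketch of how such a set might be assembled follows; the Hugging Face dataset path `allenai/c4`, the shard file name, and the example model identifier are assumptions based on common practice in similar pruning code bases, not details taken from the paper.

```python
# Hypothetical calibration-data builder: 128 random 2048-token segments
# from the first shard of English C4. Paths and model name are assumptions.
import random
import torch
from datasets import load_dataset
from transformers import AutoTokenizer

def build_c4_calibration(model_name="meta-llama/Llama-2-7b-hf",
                         n_samples=128, seq_len=2048, seed=0):
    data = load_dataset(
        "allenai/c4",
        data_files={"train": "en/c4-train.00000-of-01024.json.gz"},
        split="train",
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    random.seed(seed)
    samples = []
    while len(samples) < n_samples:
        text = data[random.randint(0, len(data) - 1)]["text"]
        ids = tokenizer(text, return_tensors="pt").input_ids
        if ids.shape[1] <= seq_len:      # skip documents shorter than one segment
            continue
        start = random.randint(0, ids.shape[1] - seq_len - 1)
        samples.append(ids[:, start:start + seq_len])
    return torch.cat(samples, dim=0)     # (n_samples, seq_len)
```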
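For the Dataset Splits row, WikiText-2 perplexity is conventionally computed by concatenating the evaluation text and scoring non-overlapping 2048-token windows. The sketch below follows that convention; the dataset identifier (`wikitext`, `wikitext-2-raw-v1`) and the choice of split are assumptions about the standard setup rather than details stated in the paper.

```python
# Standard-style perplexity evaluation over non-overlapping windows.
# Dataset identifier and split are assumptions about the usual setup.
import torch
from datasets import load_dataset

@torch.no_grad()
def wikitext2_perplexity(model, tokenizer, seq_len=2048, device="cuda"):
    data = load_dataset("wikitext", "wikitext-2-raw-v1", split="validation")
    ids = tokenizer("\n\n".join(data["text"]), return_tensors="pt").input_ids
    n_windows = ids.shape[1] // seq_len
    nlls = []
    for i in range(n_windows):
        batch = ids[:, i * seq_len:(i + 1) * seq_len].to(device)
        # Causal LM loss is the mean per-token negative log-likelihood.
        loss = model(batch, labels=batch).loss
        nlls.append(loss.float() * seq_len)
    return torch.exp(torch.stack(nlls).sum() / (n_windows * seq_len)).item()
```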