FM-Delta: Lossless Compression for Storing Massive Fine-tuned Foundation Models
Authors: Wanyi Ning, Jingyu Wang, Qi Qi, Mengde Zhu, Haifeng Sun, Daixuan Cheng, Jianxin Liao, Ce Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical and theoretical analysis reveals that most fine-tuned models in cloud have a small difference (delta) from their pre-trained models. To this end, we propose a novel lossless compression scheme FM-Delta specifically for storing massive fine-tuned models in cloud. FM-Delta maps fine-tuned and pre-trained model parameters into integers with the same bits, and entropy codes their integer delta. In this way, cloud only needs to store one uncompressed pre-trained model and other compressed fine-tuned models. Extensive experiments have demonstrated that FM-Delta efficiently reduces cloud storage consumption for massive fine-tuned models by an average of around 50% with only negligible additional time in most end-to-end cases. (A minimal sketch of this delta-coding scheme appears after the table.) |
| Researcher Affiliation | Academia | Wanyi Ning (1), Jingyu Wang (1,2), Qi Qi (1), Mengde Zhu (1), Haifeng Sun (1), Daixuan Cheng (1), Jianxin Liao (1), Ce Zhang (3); (1) Beijing University of Posts and Telecommunications, (2) Pengcheng Laboratory, (3) University of Chicago. Emails: {ningwanyi, wangjingyu, qiqi8266, arnoldzhu, hfsun}@bupt.edu.cn, daixuancheng6@gmail.com, liaojx@bupt.edu.cn, cez@uchicago.edu |
| Pseudocode | No | The paper describes the algorithm in Section 4.1 and illustrates it in Figure 5, but does not present a formal pseudocode block or algorithm listing. |
| Open Source Code | Yes | Our code is available in https://github.com/ningwanyi/FM-Delta. |
| Open Datasets | Yes | We download four common model families for different learning tasks from the popular cloud provider Hugging Face, including Stable Diffusion [33], GPT2 [34], Bert-large-uncased [35], and ResNet50 [36]. ... on Pokemon Stable Diffusion [37], Wikitext103 GPT2 [38], SST2 BERT [39], and FER2013 ResNet50 [40]. |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits with percentages or sample counts for the experiments. |
| Hardware Specification | Yes | Our experiments were run on AMD Ryzen 9 5950X 16-Core Processors @ 2.2 GHz (32 logical processors) with 251 GB of main memory. |
| Software Dependencies | No | We implemented FM-Delta for models in PyTorch [50] format. In our end-to-end simulation, we simulate the communication between cloud and users through the Python 'socket' library. Model compression and decompression processes are parallel with network transfer through reading and writing models in chunks. (A sketch of this chunked, pipelined transfer appears after the table.) No library or framework version numbers are stated. |
| Experiment Setup | No | The paper mentions aspects of fine-tuning such as 'small learning rate and a limited number of steps' and tracking 'Euclidean distance' over 'Steps' (e.g., 1000 steps) for specific models and datasets, but it does not provide a comprehensive list of concrete hyperparameter values (e.g., specific learning rate value, batch size, or detailed optimizer configurations) for its experiments. |
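
The "Research Type" row quotes the paper's core idea: reinterpret fine-tuned and pre-trained parameters as same-width integers, take their integer delta, and entropy-code it. Because fine-tuning changes most parameters only slightly, the deltas concentrate near zero, which is what makes the entropy coding effective. The sketch below illustrates that idea under stated assumptions and is not the authors' implementation: NumPy bit reinterpretation stands in for FM-Delta's parameter mapping, zlib stands in for its entropy coder, and all function names are illustrative.

```python
# Minimal sketch of the delta-coding idea quoted above (assumptions: NumPy bit
# reinterpretation for the integer mapping, zlib as a stand-in entropy coder).
import zlib
import numpy as np

def compress_delta(finetuned: np.ndarray, pretrained: np.ndarray) -> bytes:
    """Losslessly compress a fine-tuned tensor against its pre-trained counterpart."""
    ft = finetuned.astype(np.float32).view(np.uint32)   # map float32 params to 32-bit ints
    pt = pretrained.astype(np.float32).view(np.uint32)
    delta = ft - pt                                      # small deltas -> low-entropy bytes
    return zlib.compress(delta.tobytes())                # stand-in for FM-Delta's entropy coder

def decompress_delta(blob: bytes, pretrained: np.ndarray) -> np.ndarray:
    """Exactly recover the fine-tuned tensor from the compressed delta."""
    pt = pretrained.astype(np.float32).view(np.uint32)
    delta = np.frombuffer(zlib.decompress(blob), dtype=np.uint32).reshape(pretrained.shape)
    return (pt + delta).view(np.float32)

# Round trip: the cloud keeps `pretrained` uncompressed and stores only `blob` per fine-tune.
pretrained = np.random.randn(1024).astype(np.float32)
finetuned = pretrained + 1e-3 * np.random.randn(1024).astype(np.float32)
blob = compress_delta(finetuned, pretrained)
assert np.array_equal(decompress_delta(blob, pretrained), finetuned)  # lossless recovery
```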
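
The "Software Dependencies" row describes an end-to-end simulation in which compression and decompression run in parallel with network transfer by processing the model in chunks over Python sockets. Below is a minimal, hypothetical sketch of such a chunked sender; the host, port, chunk size, length-prefix framing, and use of zlib are illustrative assumptions, not the paper's simulation code.

```python
# Hypothetical sketch of chunked, pipelined model transfer over a socket
# (assumed host/port, 4 MiB chunks, length-prefix framing, zlib as placeholder compressor).
import socket
import zlib

CHUNK_SIZE = 4 * 1024 * 1024  # assumed chunk size

def send_model(path: str, host: str = "127.0.0.1", port: int = 9000) -> None:
    """Read a model file in chunks, compress each chunk, and stream it to a receiver."""
    with socket.create_connection((host, port)) as sock, open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            payload = zlib.compress(chunk)                 # compress the current chunk...
            sock.sendall(len(payload).to_bytes(8, "big"))  # ...while earlier chunks are
            sock.sendall(payload)                          # already in flight on the network
        sock.sendall((0).to_bytes(8, "big"))               # zero-length frame ends the stream
```

A matching receiver would read each length prefix, decompress the payload, and append it to the output file, so decompression likewise overlaps the download.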