Ditto: Quantization-aware Secure Inference of Transformers upon MPC
Authors: Haoqi Wu, Wenjing Fang, Yancheng Zheng, Junming Ma, Jin Tan, Lei Wang
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on Bert and GPT2 models to evaluate the performance of Ditto. The results demonstrate that Ditto is about 3.14∼4.40× faster than MPCFormer (ICLR 2023) and 1.44∼2.35× faster than the state-of-the-art work PUMA with negligible utility degradation. |
| Researcher Affiliation | Industry | 1Ant Group, Hangzhou, China. Correspondence to: Haoqi Wu <haoqi.whq@antgroup.com> |
| Pseudocode | Yes | Algorithm 1 Secure Up Cast Protocol... Algorithm 2 Approximated GeLU Protocol... Algorithm 3 Approximated Softmax Protocol... Algorithm 4 Secure Down Cast Protocol (illustrative plaintext sketches of the cast and activation approximations are given below the table) |
| Open Source Code | Yes | The code is available at: https://github.com/secretflow/spu. |
| Open Datasets | Yes | We use the pre-trained Bert models and GPT models in Hugging Face (Wolf et al., 2020). For Bert, we use Bert-base and Bert-large pre-trained over BookCorpus (Zhu et al., 2015) and English Wikipedia (Wikipedia contributors, 2004) datasets. For GPT, we use GPT2-base and GPT2-medium pre-trained over the Wikitext-103 dataset (Merity et al., 2016). (A sketch of loading these public assets appears below the table.) |
| Dataset Splits | No | We evaluate Bert over RTE, CoLA, QQP and QNLI from GLUE benchmarks (Wang et al., 2019), and GPT2 on the validation set of Wikitext-103. (While "validation set" is mentioned, the explicit split sizes or percentages for reproducibility are not provided in the main text.) |
| Hardware Specification | Yes | We conduct the experiments on one CentOS 8 machine equipped with one AMD Ryzen CPU (32 cores and 3.60GHz) and 256GB of RAM. |
| Software Dependencies | No | We implement Ditto upon the framework SecretFlow-SPU that supports privacy-preserving machine learning. (No specific version numbers are provided for SPU or other software dependencies.) |
| Experiment Setup | Yes | Experimental setup. We implement Ditto upon the framework SecretFlow-SPU... We conduct the experiments on one CentOS 8 machine equipped with one AMD Ryzen CPU (32 cores and 3.60GHz) and 256GB of RAM. We consider two network environments: 1) LAN setting with a bandwidth of 5Gbps and 0.4ms round-trip time; 2) WAN setting with a bandwidth of 400Mbps and 40ms round-trip time. We simulate the network environments using the Linux tc tool. For Bert models, the input sequence length is set to 128... As for GPT2 models, we generate 1 new token with an input length of 32... Regarding the fine-tuning of Bert models... we use a batch size of 32 for Bert-base and 16 for Bert-large. All the inputs are of sequence length 128. We train the models for 3 epochs... We run a grid search with learning rate in [2e-5, 3e-5, 4e-5, 5e-5]. (A fine-tuning sketch matching this recipe appears below the table.) |
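
The cast protocols listed in the Pseudocode row switch secret shares between fixed-point precisions so that most of the inference can run at low precision. Below is a minimal plaintext sketch of that idea; the bit widths and fraction bits are illustrative assumptions, and no secret sharing or MPC machinery is involved.

```python
# Plaintext sketch of quantization-aware up/down casting between fixed-point
# precisions, the idea behind the paper's secure cast protocols (Algorithms 1
# and 4). This is NOT the MPC protocol itself; the bit widths and fraction
# bits below are illustrative assumptions.
import numpy as np

def encode(x, frac_bits, dtype):
    """Encode floats as fixed-point integers with `frac_bits` fractional bits."""
    return np.round(x * (1 << frac_bits)).astype(dtype)

def decode(x_fp, frac_bits):
    """Decode fixed-point integers back to floats."""
    return x_fp.astype(np.float64) / (1 << frac_bits)

def up_cast(x_fp, frac_from, frac_to, dtype_to):
    """Move to a wider ring / higher precision by left-shifting the scale."""
    return x_fp.astype(dtype_to) << (frac_to - frac_from)

def down_cast(x_fp, frac_from, frac_to, dtype_to):
    """Move to a narrower ring / lower precision by truncating low bits."""
    return (x_fp >> (frac_from - frac_to)).astype(dtype_to)

x = np.array([-1.5, 0.25, 3.125])
lo = encode(x, frac_bits=8, dtype=np.int32)   # low-precision activations
hi = up_cast(lo, 8, 18, np.int64)             # up cast before precision-sensitive ops
back = down_cast(hi, 18, 8, np.int32)         # down cast afterwards
print(decode(back, 8))                        # ~[-1.5, 0.25, 3.125]
```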
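
The approximated GeLU and Softmax protocols (Algorithms 2 and 3) replace the exact non-linearities with MPC-friendly forms. The exact approximations Ditto uses are not quoted in the table, so the snippet below only shows standard plaintext baselines, the tanh-based GeLU approximation and a max-stabilized Softmax, which are useful references when measuring utility degradation.

```python
# Plaintext baselines for the non-linearities that Ditto approximates under
# MPC (Algorithms 2 and 3). These are NOT Ditto's protocols: the tanh-based
# GeLU and max-stabilized Softmax below are standard formulations, shown only
# as references for comparing approximation error against an exact GeLU.
import numpy as np
from scipy.special import erf

def gelu_exact(x):
    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))

def gelu_tanh(x):
    # Tanh approximation from the original GeLU paper (Hendrycks & Gimpel).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def softmax_stable(x, axis=-1):
    # Subtracting the row max keeps exp() in a bounded range, the same trick
    # MPC-friendly softmax variants use to bound their approximation domain.
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

x = np.linspace(-4.0, 4.0, 9)
print(np.max(np.abs(gelu_exact(x) - gelu_tanh(x))))   # small approximation error
print(softmax_stable(np.array([[1.0, 2.0, 3.0]])))    # rows sum to 1
```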
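
The models and datasets cited in the evaluation are public Hugging Face assets. A minimal loading sketch follows; it is not the authors' code, and only uses the standard public `datasets`/`transformers` APIs with the checkpoint names mentioned in the paper.

```python
# Sketch of fetching the public checkpoints and data referenced in the paper.
# Not the authors' code; it only shows that bert-base-uncased, GLUE and
# Wikitext-103 are downloadable via the Hugging Face hub.
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

rte = load_dataset("glue", "rte")                                   # train/validation/test
wikitext_val = load_dataset("wikitext", "wikitext-103-raw-v1", split="validation")

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)                              # RTE is binary entailment

batch = tokenizer(
    rte["validation"]["sentence1"][:4], rte["validation"]["sentence2"][:4],
    padding="max_length", truncation=True, max_length=128,          # paper uses length 128
    return_tensors="pt")
print(model(**batch).logits.shape)                                  # torch.Size([4, 2])
```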
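
The reported fine-tuning recipe (3 epochs, batch size 32 for Bert-base, sequence length 128, learning-rate grid over [2e-5, 3e-5, 4e-5, 5e-5]) maps naturally onto the `transformers` Trainer. The sketch below follows that recipe but is not the authors' script; the Trainer-based loop and accuracy-based model selection are assumptions.

```python
# Hedged sketch of the reported fine-tuning recipe: 3 epochs, batch size 32,
# sequence length 128, grid search over four learning rates. Not the authors'
# script; the Trainer loop and accuracy-based selection are assumptions.
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoded = load_dataset("glue", "rte").map(
    lambda ex: tokenizer(ex["sentence1"], ex["sentence2"],
                         padding="max_length", truncation=True, max_length=128),
    batched=True)

def accuracy(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

best_acc, best_lr = -1.0, None
for lr in [2e-5, 3e-5, 4e-5, 5e-5]:                  # grid reported in the paper
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)
    args = TrainingArguments(output_dir=f"rte-lr{lr}", learning_rate=lr,
                             num_train_epochs=3, per_device_train_batch_size=32)
    trainer = Trainer(model=model, args=args,
                      train_dataset=encoded["train"],
                      eval_dataset=encoded["validation"],
                      compute_metrics=accuracy)
    trainer.train()
    acc = trainer.evaluate()["eval_accuracy"]
    if acc > best_acc:
        best_acc, best_lr = acc, lr
print(f"best validation accuracy {best_acc:.4f} at learning rate {best_lr}")
```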