MPCFORMER: FAST, PERFORMANT AND PRIVATE TRANSFORMER INFERENCE WITH MPC
Authors: Dacheng Li, Hongyi Wang, Rulin Shao, Han Guo, Eric Xing, Hao Zhang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive evaluations, we show that MPCFORMER significantly speeds up Transformer inference in MPC settings while achieving similar ML performance to the input model. On the IMDb dataset, it achieves similar performance to BERT-Base, while being 5.3× faster. On the GLUE benchmark, it achieves 97% performance of BERT-Base with a 2.2× speedup. |
| Researcher Affiliation | Collaboration | Carnegie Mellon University; Mohamed bin Zayed University of Artificial Intelligence; Petuum Inc.; University of California, Berkeley |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (e.g., a figure or section explicitly labeled "Algorithm" or "Pseudocode"). |
| Open Source Code | Yes | Code is available at https://github.com/MccRee177/MPCFormer. |
| Open Datasets | Yes | We evaluate our MPCFormer framework with different approximations and compare it with baselines on the IMDb dataset and the GLUE benchmark (Maas et al., 2011; Wang et al., 2018). |
| Dataset Splits | No | The paper mentions using the IMDb and GLUE benchmarks, which typically have standard splits. However, it does not explicitly provide the training/test/validation dataset splits (e.g., percentages, sample counts, or direct citations to the specific split definitions) within the text. |
| Hardware Specification | Yes | We use two P3.2x AWS instances to simulate the inference service scenarios (one P3.2x for the model provider, and one for the user). Each instance is equipped with one Tesla V100 GPU and 10GbE Ethernet bandwidth. |
| Software Dependencies | No | The paper mentions using CrypTen and Hugging Face for experiments. However, it does not provide specific version numbers for these software components or for any other key libraries/solvers required for reproduction. (A hedged CrypTen usage sketch follows the table.) |
| Experiment Setup | Yes | Baselines are trained with learning rates tuned over 1e-6, 5e-6, 1e-5, and 1e-4; the number of epochs over 10, 30, and 100; batch size 32 for IMDb; and batch sizes 64 and 256 for GLUE. MPCBert-B is trained with learning rate 5e-5 for embedding and Transformer layer distillation, and 1e-5 for prediction layer distillation. (A hedged configuration sketch follows the table.) |
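
To make the Experiment Setup row concrete, the search space and distillation learning rates quoted above can be written down as a small configuration sketch. The dictionary names below (`baseline_grid`, `mpcbert_b_distill_lr`) are illustrative assumptions; only the numeric values come from the paper.

```python
# Hypothetical configuration sketch; key names are assumptions,
# the numeric values are those quoted in the Experiment Setup row.
baseline_grid = {
    "learning_rate": [1e-6, 5e-6, 1e-5, 1e-4],   # tuned per task
    "num_epochs": [10, 30, 100],
    "batch_size": {"imdb": [32], "glue": [64, 256]},
}

# Two-stage distillation schedule reported for MPCBert-B.
mpcbert_b_distill_lr = {
    "embedding_and_transformer_layers": 5e-5,
    "prediction_layer": 1e-5,
}
```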
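
On software dependencies: the paper builds on CrypTen and Hugging Face without pinning versions. Purely as an illustration of the kind of CrypTen-based private inference pipeline involved, the sketch below wraps a toy PyTorch model with CrypTen's public API. This is not MPCFormer's code, and the paper's two-party (model provider / user) deployment across AWS instances, which relies on CrypTen's distributed launcher, is omitted here.

```python
import torch
import crypten
import crypten.nn

crypten.init()  # single-process setup; the paper's two-party AWS setting is omitted

# Toy stand-in for a Transformer; MPCFormer operates on BERT-style models.
model = torch.nn.Sequential(torch.nn.Linear(16, 4))
dummy_input = torch.empty(1, 16)

# Convert the PyTorch model to a CrypTen module and secret-share its parameters.
private_model = crypten.nn.from_pytorch(model, dummy_input)
private_model.encrypt()

# Encrypt the input and run inference on secret shares.
x_enc = crypten.cryptensor(torch.randn(1, 16))
y_enc = private_model(x_enc)
print(y_enc.get_plain_text())  # decode only where the result may be revealed
```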