MPCFORMER: FAST, PERFORMANT AND PRIVATE TRANSFORMER INFERENCE WITH MPC

Authors: Dacheng Li, Hongyi Wang, Rulin Shao, Han Guo, Eric Xing, Hao Zhang

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive evaluations, we show that MPCFORMER significantly speeds up Transformer inference in MPC settings while achieving similar ML performance to the input model. On the IMDb dataset, it achieves similar performance to BERT-Base while being 5.3× faster. On the GLUE benchmark, it achieves 97% of BERT-Base's performance with a 2.2× speedup.
Researcher Affiliation | Collaboration | Carnegie Mellon University; Mohamed bin Zayed University of Artificial Intelligence; Petuum Inc.; University of California, Berkeley
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (e.g., a figure or section explicitly labeled "Algorithm" or "Pseudocode").
Open Source Code | Yes | Code is available at https://github.com/MccRee177/MPCFormer.
Open Datasets | Yes | We evaluate our MPCFormer framework with different approximations and compare it with baselines on the IMDb dataset and the GLUE benchmark (Maas et al., 2011; Wang et al., 2018).
Dataset Splits | No | The paper mentions using the IMDb and GLUE benchmarks, which typically have standard splits. However, it does not explicitly provide the training/test/validation dataset splits (e.g., percentages, sample counts, or direct citations to the specific split definitions) within the text.
Hardware Specification | Yes | We use two P3.2x AWS instances to simulate the inference service scenarios (one P3.2x for the model provider, and one for the user). Each instance is equipped with one Tesla V100 GPU and 10 GbE Ethernet bandwidth.
Software Dependencies | No | The paper mentions using CrypTen and Hugging Face for experiments, but it does not provide specific version numbers for these software components or for any other key libraries required for reproduction.
Experiment Setup | Yes | Baselines are trained with learning rates tuned over 1e-6, 5e-6, 1e-5, and 1e-4; the number of epochs over 10, 30, and 100; a batch size of 32 for IMDb; and batch sizes of 64 and 256 for GLUE. MPCBert-B is trained with a learning rate of 5e-5 for embedding and Transformer layer distillation, and 1e-5 for prediction layer distillation.
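
The speedups quoted in the "Research Type" row are measured inside CrypTen, the MPC framework noted under "Software Dependencies". Below is a minimal sketch, not the authors' pipeline, of how a PyTorch model can be converted to a secret-shared CrypTen module for private inference; the stand-in model and input shape are illustrative assumptions rather than the paper's Transformer.

```python
import torch
import crypten
import crypten.nn as cnn

crypten.init()  # set up the MPC communicator (single process here, for illustration)

# Stand-in PyTorch model; the paper's setting would use a fine-tuned Transformer instead.
torch_model = torch.nn.Sequential(
    torch.nn.Linear(768, 768),
    torch.nn.ReLU(),
    torch.nn.Linear(768, 2),
)
torch_model.eval()
dummy_input = torch.empty(1, 768)

# Trace the model into a CrypTen module and secret-share its parameters.
private_model = cnn.from_pytorch(torch_model, dummy_input)
private_model.encrypt()
private_model.eval()

# Secret-share an input, run encrypted inference, and reveal the logits.
x_enc = crypten.cryptensor(torch.randn(1, 768))
logits = private_model(x_enc).get_plain_text()
print(logits)
```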
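
Regarding the "Open Datasets" and "Dataset Splits" rows: the paper points to IMDb and GLUE but does not spell out its split handling, so the sketch below only shows how the benchmarks' standard splits are exposed by the Hugging Face `datasets` library (an assumption about tooling; the released code may load data differently).

```python
from datasets import load_dataset

# IMDb ships a fixed 25,000/25,000 train/test split (plus an unlabeled split).
imdb = load_dataset("imdb")
print({name: len(split) for name, split in imdb.items()})

# GLUE tasks (SST-2 shown here) ship train/validation/test splits;
# test labels are hidden and scored via the GLUE server.
sst2 = load_dataset("glue", "sst2")
print({name: len(split) for name, split in sst2.items()})
```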
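
The "Experiment Setup" row lists the baseline hyperparameter grid and the stage-wise distillation learning rates. One hedged way to encode that sweep is sketched below; `train(...)` is a hypothetical entry point, not part of the released MPCFormer code.

```python
from itertools import product

# Baseline fine-tuning grid quoted in the paper.
baseline_grid = {
    "learning_rate": [1e-6, 5e-6, 1e-5, 1e-4],
    "num_epochs": [10, 30, 100],
    "batch_sizes": {"imdb": [32], "glue": [64, 256]},
}

# MPCBert-B distillation: 5e-5 for embedding and Transformer layer distillation,
# 1e-5 for prediction layer distillation.
distillation_stages = [
    {"stage": "embedding",   "learning_rate": 5e-5},
    {"stage": "transformer", "learning_rate": 5e-5},
    {"stage": "prediction",  "learning_rate": 1e-5},
]

for task, batch_sizes in baseline_grid["batch_sizes"].items():
    for lr, epochs, bs in product(
        baseline_grid["learning_rate"], baseline_grid["num_epochs"], batch_sizes
    ):
        # train(task=task, learning_rate=lr, epochs=epochs, batch_size=bs)  # hypothetical
        pass
```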