Secure Quantized Training for Deep Learning
Authors: Marcel Keller, Ke Sun
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implement training of neural networks in secure multi-party computation (MPC) using quantization commonly used in said setting. We are the first to present an MNIST classifier purely trained in MPC that comes within 0.2 percent of the accuracy of the same convolutional neural network trained via plaintext computation. More concretely, we have trained a network with two convolutional and two dense layers to 99.2% accuracy in 3.5 hours (under one hour for 99% accuracy). We have also implemented AlexNet for CIFAR-10, which converges in a few hours. We develop novel protocols for exponentiation and inverse square root. Finally, we present experiments in a range of MPC security models for up to ten parties, both with honest and dishonest majority as well as semi-honest and malicious security. (Illustrative plaintext sketches of the exponentiation and inverse-square-root primitives follow the table.) |
| Researcher Affiliation | Collaboration | CSIRO's Data61, Sydney, Australia; The Australian National University. |
| Pseudocode | Yes | Algorithm 1: Exponentiation with base two (Aly & Smart, 2019). (See the exponentiation sketch after the table.) |
| Open Source Code | Yes | Code available at https://github.com/data61/MP-SPDZ. |
| Open Datasets | Yes | For a concrete measurement of accuracy and running times, we train a multi-class classifier for the widely-used MNIST dataset (LeCun et al., 2010). We have also implemented AlexNet for CIFAR-10, which converges in a few hours. |
| Dataset Splits | Yes | We use SGD with learning rate 0.01, batch size 128, and the usual MNIST training/test split. Test/training split: We have used the usual MNIST split. |
| Hardware Specification | Yes | We use the CPU of one AWS c5.9xlarge instance per party whereas Tan et al. use one NVIDIA Tesla V100 GPU per party. |
| Software Dependencies | No | We build our implementation on MP-SPDZ by Keller (2020). Other software, such as TensorFlow and Keras, is mentioned without explicit version numbers. |
| Experiment Setup | Yes | We use SGD with learning rate 0.01, batch size 128, and the usual MNIST training/test split. In the following we discuss our choice of hyperparameters. Number of epochs: As we found convergence after 100 epochs, we have run most of our benchmarks for 150 epochs... Mini-batch size: We have used 128 throughout... Learning rate: ...we settled for 0.01 for SGD and 0.001 for AMSGrad... Hyperparameters for Adam/AMSGrad: We use the common choice β1 = 0.9, β2 = 0.999, and ϵ = 10^-8. (A plaintext sketch of this setup follows the table.) |
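
The paper's Algorithm 1 computes base-two exponentiation on quantized (fixed-point) values. As a rough illustration of the underlying numerics, here is a plaintext Python sketch that splits a fixed-point input into integer and fractional parts, handles the integer part as an exact power of two, and approximates 2^frac with a low-degree polynomial. The precision `F`, the Taylor coefficients, and the helper names are assumptions for illustration; this is not the MPC protocol, which operates on secret-shared values.

```python
import math

F = 16  # number of fractional bits in the fixed-point encoding (assumed)

def to_fixed(x: float) -> int:
    """Encode a real number as an integer with F fractional bits."""
    return round(x * (1 << F))

def from_fixed(v: int) -> float:
    """Decode a fixed-point integer back to a float."""
    return v / (1 << F)

def pow2_fixed(v: int) -> int:
    """Approximate 2**x for a fixed-point encoded x = v / 2**F."""
    int_part = v >> F                        # floor(x), handled exactly
    frac = (v & ((1 << F) - 1)) / (1 << F)   # x - floor(x), in [0, 1)
    # Crude degree-4 Taylor expansion of 2**frac = exp(frac * ln 2);
    # the actual protocol uses a more accurate approximation.
    t = frac * math.log(2)
    approx = 1 + t + t**2 / 2 + t**3 / 6 + t**4 / 24
    return to_fixed(approx * 2.0 ** int_part)

print(from_fixed(pow2_fixed(to_fixed(3.25))))   # ~9.514  (2**3.25 ≈ 9.5137)
print(from_fixed(pow2_fixed(to_fixed(-1.5))))   # ~0.3536 (2**-1.5 ≈ 0.3536)
```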
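
The abstract also mentions a novel protocol for inverse square root, which Adam/AMSGrad-style training needs for dividing by the root of the second-moment estimate. The sketch below only illustrates the function being approximated, not the paper's protocol: it starts from a power-of-two estimate (which, in the quantized setting, can be derived from the bit length of the input) and refines it with Newton iterations. The function name and iteration count are assumptions.

```python
import math

def inv_sqrt(x: float, iterations: int = 5) -> float:
    """Plaintext Newton iteration for 1/sqrt(x); not the MPC protocol."""
    assert x > 0
    # Power-of-two initial guess within a factor of sqrt(2) of the result.
    e = math.floor(math.log2(x))
    y = 2.0 ** (-((e + 1) // 2))
    for _ in range(iterations):
        # Newton step for f(y) = 1/y**2 - x.
        y = y * (1.5 - 0.5 * x * y * y)
    return y

print(inv_sqrt(2.0), 1 / math.sqrt(2.0))   # both ≈ 0.7071
print(inv_sqrt(0.5), 1 / math.sqrt(0.5))   # both ≈ 1.4142
```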
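
For context on the reported hyperparameters, the following is a minimal plaintext Keras sketch of an equivalent training run: the standard MNIST train/test split, batch size 128, SGD at learning rate 0.01 (or AMSGrad at 0.001 with β1 = 0.9, β2 = 0.999, ϵ = 10^-8), and 150 epochs. The layer widths and kernel sizes are placeholders; the paper only states that the network has two convolutional and two dense layers, and the actual experiments run inside MP-SPDZ rather than TensorFlow.

```python
import tensorflow as tf

# Standard MNIST train/test split, scaled to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0
x_test = x_test[..., None] / 255.0

# Two convolutional and two dense layers; the sizes are placeholders.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 5, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(16, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# SGD variant from the table; swap in the commented line for the
# AMSGrad hyperparameters instead.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
# optimizer = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9,
#                                      beta_2=0.999, epsilon=1e-8, amsgrad=True)

model.compile(optimizer=optimizer,
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=128, epochs=150,
          validation_data=(x_test, y_test))
```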