HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption

Authors: Seewoo Lee, Garam Lee, Jung Woo Kim, Junbum Shin, Mun-Kyu Lee

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results for the five benchmark datasets show total training times of 567 to 3442 seconds, i.e., less than an hour. We implemented and evaluated HETAL using five well-known benchmark datasets (MNIST (Deng, 2012), CIFAR-10 (Krizhevsky et al., 2009), Face Mask Detection (Larxel, 2020), DermaMNIST (Yang et al., 2023), and SNIPS (Coucke et al., 2018)), in addition to two pre-trained models (ViT (Dosovitskiy et al., 2021) and MPNet (Song et al., 2020)). Our experimental results showed training times of 4.29 to 15.72 seconds per iteration and total training times of 567 to 3442 seconds (less than an hour), with almost the same classification accuracy as non-encrypted training. The accuracy loss from encrypted training was at most 0.5%.
Researcher Affiliation | Collaboration | ¹University of California, Berkeley, US; ²CryptoLab Inc., Seoul, South Korea; ³Inha University, Incheon, South Korea
Pseudocode | Yes | Algorithm 1 Row-wise softmax approximation; Algorithm 2 DiagABT: Homomorphic evaluation of AB^T; Algorithm 3 RotLeft(A, k); Algorithm 4 PRotUp(B, k); Algorithm 5 DiagATB: Homomorphic evaluation of A^T B
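The listed algorithms operate on ciphertexts; as a plaintext point of reference, the sketch below (NumPy, not the authors' HEaaN-based implementation, with illustrative function names) shows the unencrypted quantities they are designed to evaluate: a numerically stable row-wise softmax and the matrix products AB^T and A^T B.

```python
import numpy as np

def rowwise_softmax(Z):
    """Plaintext reference for the row-wise softmax approximated by Algorithm 1
    (max-subtraction keeps the exponentials numerically stable)."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def abt_reference(A, B):
    """Plaintext result targeted by DiagABT (Algorithm 2): A @ B^T."""
    return A @ B.T

def atb_reference(A, B):
    """Plaintext result targeted by DiagATB (Algorithm 5): A^T @ B."""
    return A.T @ B
```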
Open Source Code | Yes | Our codes for the experiments are available at https://github.com/CryptoLabInc/HETAL.
Open Datasets | Yes | We used five benchmark datasets for image classification and sentiment analysis: MNIST (Deng, 2012), CIFAR-10 (Krizhevsky et al., 2009), Face Mask Detection (Larxel, 2020), DermaMNIST (Yang et al., 2023), and SNIPS (Coucke et al., 2018).
Dataset Splits | Yes | The client extracts features from training and validation data. Table 4 describes the number of samples in each split (train, validation, test) for each benchmark. The splits are already given for the DermaMNIST and SNIPS datasets; for the other datasets, we randomly split the original training sets into train and validation sets at a ratio of 7:1.
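As a concrete illustration of the 7:1 split, the snippet below uses scikit-learn (an assumption; the paper does not name the splitting tool, seed, or stratification) on placeholder features standing in for the client-extracted embeddings: holding out 1/8 of the original training set yields a 7:1 train/validation ratio.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder features/labels standing in for the client-extracted
# ViT/MPNet embeddings of the original training set.
X = np.random.randn(1000, 768)
y = np.random.randint(0, 10, size=1000)

# A 7:1 train/validation ratio corresponds to holding out 1/8 of the data.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=1 / 8, random_state=0, stratify=y
)
print(len(X_train), len(X_val))  # 875 125
```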
Hardware Specification | Yes | We used an Intel Xeon Gold 6248 CPU at 2.50 GHz, running with 64 threads, and a single Nvidia Ampere A40 GPU.
Software Dependencies | No | The paper mentions 'HEaaN (CryptoLab)' and 'NumPy (Harris et al., 2020)' as software used but does not provide specific version numbers for these libraries or for the programming language (Python) used.
Experiment Setup | Yes | For early stopping, we set the patience to 3. Table 1 shows that we fine-tuned the encrypted models on all benchmark datasets within an hour. In addition, the accuracy drops of the encrypted models were at most 0.51% compared to the unencrypted models with the same hyperparameters. (See Table 5 in the Appendix for the hyperparameters and the number of epochs used for early stopping.)
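The patience-3 criterion can be read as the standard early-stopping rule; the sketch below is a generic Python illustration (not the authors' HEaaN-based training loop), where `run_epoch` and `validate` are hypothetical callbacks for one training pass and a validation-accuracy evaluation.

```python
def train_with_early_stopping(run_epoch, validate, max_epochs, patience=3):
    """Stop after `patience` consecutive epochs without improvement
    in validation accuracy, returning the best accuracy observed."""
    best_acc, stale_epochs = 0.0, 0
    for _ in range(max_epochs):
        run_epoch()        # one pass of (encrypted) fine-tuning
        acc = validate()   # accuracy on the validation split
        if acc > best_acc:
            best_acc, stale_epochs = acc, 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break
    return best_acc
```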