Label Leakage and Protection in Two-party Split Learning
Authors: Oscar Li, Jiankai Sun, Xin Yang, Weihao Gao, Hongyi Zhang, Junyuan Xie, Virginia Smith, Chong Wang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate the effectiveness of our protection techniques against the identified attacks, and show that Marvell in particular has improved privacy-utility tradeoffs relative to baseline approaches. ... We experimentally demonstrate the effectiveness of our protection techniques and MARVELL’s improved privacy-utility tradeoffs compared to other protection baselines (Section 5). ... In this section, we first describe our experiment setup and then demonstrate the label protection quality of Marvell as well as its privacy-utility trade-off relative to baseline approaches. Empirical Setup. We use three real-world binary classification datasets for evaluation: Criteo and Avazu, two online advertising prediction datasets with millions of examples; and ISIC, a healthcare image dataset for skin cancer prediction. |
| Researcher Affiliation | Collaboration | Oscar Li¹, Jiankai Sun², Xin Yang², Weihao Gao², Hongyi Zhang², Junyuan Xie², Virginia Smith¹, Chong Wang². ¹Carnegie Mellon University; ²ByteDance Inc. |
| Pseudocode | Yes | Algorithm 1: Marvell algorithm |
| Open Source Code | Yes | Code available at https://github.com/OscarcarLi/label-protection |
| Open Datasets | Yes | We use three real-world binary classification datasets for evaluation: Criteo and Avazu, two online advertising prediction datasets with millions of examples; and ISIC, a healthcare image dataset for skin cancer prediction. ... Criteo. Criteo display advertising challenge, 2014. URL https://www.kaggle.com/c/criteo-display-ad-challenge/data. ... Avazu. Avazu click-through rate prediction, 2015. URL https://www.kaggle.com/c/avazu-ctr-prediction/data. ... ISIC. Siim-isic melanoma classification, 2020. URL https://www.kaggle.com/c/siim-isic-melanoma-classification/data. |
| Dataset Splits | No | The paper specifies train-test splits (e.g., "90%-10% train-test split" for Criteo and Avazu, "80%-20% training and test split" for ISIC), but it does not mention a distinct validation set split. |
| Hardware Specification | Yes | We conduct our experiments over 16 Nvidia 1080Ti GPU cards. |
| Software Dependencies | No | The paper mentions using the "Adam optimizer" but does not specify its version or any other software dependencies with version numbers (e.g., Python, PyTorch/TensorFlow versions). |
| Experiment Setup | Yes | [Criteo] We use the Adam optimizer with a batch size of 1024 and a learning rate of 1e-4 throughout the entire training of 5 epochs (approximately 20k stochastic gradient updates). [ISIC] We use the Adam optimizer with a batch size of 128 and a learning rate of 1e-5 throughout the entire training of 1000 epochs (approximately 35k stochastic gradient updates). [Avazu] We use the Adam optimizer with a batch size of 32768 and a learning rate of 1e-4 throughout the entire training of 5 epochs (approximately 5.5k stochastic gradient updates). |
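
The Experiment Setup row can be captured as a small configuration table. The sketch below is illustrative only: `TrainConfig`, `CONFIGS`, and `implied_examples_per_epoch` are hypothetical names, not taken from the authors' repository, and no deep learning framework is imported because the paper does not state which one it used. The derived figure is simple arithmetic on the quoted values (updates per epoch times batch size), so it is a rough estimate from rounded update counts, not a dataset size reported by the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as quoted in the Experiment Setup row (hypothetical container)."""
    optimizer: str
    batch_size: int
    learning_rate: float
    epochs: int
    approx_updates: int  # "approximately N stochastic gradient updates" as quoted

    def implied_examples_per_epoch(self) -> int:
        # (updates / epochs) * batch_size; an estimate only, since the
        # update counts in the source are themselves approximate.
        return round(self.approx_updates / self.epochs * self.batch_size)


CONFIGS = {
    "Criteo": TrainConfig("Adam", 1024, 1e-4, 5, 20_000),
    "ISIC": TrainConfig("Adam", 128, 1e-5, 1000, 35_000),
    "Avazu": TrainConfig("Adam", 32768, 1e-4, 5, 5_500),
}

if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        print(f"{name}: lr={cfg.learning_rate}, batch={cfg.batch_size}, "
              f"~{cfg.implied_examples_per_epoch():,} examples per epoch (estimate)")
```

Keeping the three settings side by side makes the differences easy to see: ISIC uses a smaller batch size and learning rate over many more epochs, while Avazu uses a much larger batch size than Criteo at the same learning rate and epoch count.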