VFLAIR: A Research Library and Benchmark for Vertical Federated Learning
Authors: Tianyuan Zou, Zixuan Gu, Yu He, Hideaki Takahashi, Yang Liu, Ya-Qin Zhang
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also benchmark the performance of 11 attacks and 8 defenses under different communication and model partition settings and draw concrete insights and recommendations on the choice of defense strategies for different practical VFL deployment scenarios. We design VFLAIR, a lightweight and extensible VFL framework that aims to facilitate research and development of VFL (see Fig. 1). We design standardized pipelines for VFL training and validation, supporting 13 datasets, 29 different local model architectures including linear regression, tree and neural networks, 6 different global models, 2 model partition settings, 5 communication protocols, 1 encryption method, 11 attacks and 8 defense methods, each implemented as a distinct module that can be easily extended. We propose new evaluation metrics and modules, and perform extensive experiments to benchmark various perspectives of VFL, from which we draw key insights on VFL system design choices, in order to promote future development and practical deployment of VFL. |
| Researcher Affiliation | Collaboration | Tianyuan Zou1, Zixuan Gu2, Yu He3, Hideaki Takahashi4, Yang Liu1,5, and Ya-Qin Zhang1. 1 Institute for AI Industry Research, Tsinghua University, Beijing, China; 2 Weiyang College, Tsinghua University, Beijing, China; 3 School of Computer Science, Fudan University, Shanghai, China; 4 College of Arts and Sciences, The University of Tokyo, Tokyo, Japan; 5 Shanghai Artificial Intelligence Laboratory, China |
| Pseudocode | Yes | The training procedure is shown in detail in Algorithm 1 in Appendix B. Training and inference procedures of tree-based VFL are included in Algorithm 3 in Appendix B. Algorithm 1: A Basic VFL Training Procedure using FedSGD. Algorithm 2: A vertical federated learning framework with Homomorphic Encryption (Zou et al., 2022). Algorithm 3: A Basic Training Process of Tree-based VFL. (A hedged sketch of one FedSGD-style VFL round appears below the table.) |
| Open Source Code | Yes | To address this need, we present an extensible and lightweight VFL framework VFLAIR (available at https://github.com/FLAIR-THU/VFLAIR), which supports VFL training with a variety of models, datasets and protocols, along with standardized modules for comprehensive evaluations of attacks and defense strategies. Our code is also available at https://github.com/FLAIR-THU/VFLAIR. |
| Open Datasets | Yes | Using VFLAIR, we benchmark the VFL main task performance using 13 datasets including MNIST (Yann LeCun), CIFAR10 (Krizhevsky & Hinton, 2009), CIFAR100 (Krizhevsky & Hinton, 2009), NUSWIDE (Chua et al., 2009), Breast Cancer (Street et al., 1993), Diabetes (Kahn), Adult Income (Becker & Kohavi, 1996), Criteo (Guo et al., 2017), Avazu (Qu et al., 2018), Cora (McCallum et al., 2000), News20 (Lang, 1995), Credit (Dua & Graff, 2017) and Nursery (Dua & Graff, 2017). Detailed data partition strategies are included in Appendix H.1. |
| Dataset Splits | Yes | The MNIST dataset comprises handwritten digits and consists of a training set with 60,000 examples, along with a test set containing 10,000 examples, distributed across 10 classes. (Appendix H.1) The CIFAR10 dataset consists of 60,000 colour images... 5,000 for training and 1,000 for testing. (Appendix H.1) Breast Cancer dataset... We use 20% of the whole dataset samples for testing and the rest for training. (Appendix H.1) Adult Income dataset... We use 30% of the whole dataset samples for testing and the rest for training. (Appendix H.1) Criteo... 90% used for training and the rest 10% used for testing following previous work (Fu et al., 2022c). (Appendix H.1) (A hedged split sketch appears below the table.) |
| Hardware Specification | Yes | We mainly use NVIDIA GeForce RTX 3090 for all the benchmark experiments except for tree-based VFL related experiments for which we use Intel(R) Xeon(R) CPU E5-2650 v2 instead. (Appendix H) According to FATE... VFLAIR... a 1-core CPU with less than 4G memory and less than 4.0G hard disk is required for installation and environment preparation. (Appendix D, Table 9) |
| Software Dependencies | No | The paper mentions general software environments and components like 'PyTorch' in references and 'git clone', 'pip install' commands in setup figures, but it does not specify concrete version numbers for software libraries or dependencies (e.g., Python 3.x, PyTorch 1.x, CUDA x.x) used for the experiments. |
| Experiment Setup | Yes | We benchmark the VFL main task performance using 13 datasets... The local models used for both settings are detailed in Tab. 10 in Appendix H.1... Details on training models and training hyper-parameters for the following experiments are included in Tab. 10 (in Appendix H.1) and Appendix H.2 respectively. Specific hyper-parameters for each attack are listed below if they exist (Appendix H.3). Detailed defense-related experimental hyper-parameter settings are listed below (Appendix H.4). For NN-based VFL, the learning rate and training epochs used for reporting the MP listed in Tabs. 3, 4, 6, 7 and 13 are included in Tabs. 11 and 12. A batchsize of 1024 is used throughout all the experiments (except for MNIST, Criteo, Avazu and News20-S5, which use batchsizes of 2048, 8192, 8192 and 128 respectively). For tree-based VFL, for reporting the MP listed in Tab. 5, each party is equipped with 5 trees, each of depth 6, under all circumstances. Note that the learning rate is only utilized for XGBoost and is set to 0.003 in the experiments. (Appendix H.2) (A hedged configuration sketch restating these hyper-parameters appears below the table.) |
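
The Pseudocode row points to Algorithm 1, a basic VFL training procedure using FedSGD. Below is a minimal, self-contained sketch of one such round for two parties, written in plain PyTorch; all names (`bottom_a`, `bottom_b`, `top_model`, etc.) are illustrative and do not correspond to VFLAIR's actual modules or API.

```python
# Hedged sketch of a FedSGD-style VFL round: party A is active (holds labels and the
# global/top model), party B is passive. Names are illustrative, not VFLAIR's API.
import torch
import torch.nn as nn

torch.manual_seed(0)
x_a, x_b = torch.randn(64, 10), torch.randn(64, 18)   # vertical feature partitions
y = torch.randint(0, 2, (64,))                        # labels held only by party A

bottom_a, bottom_b = nn.Linear(10, 8), nn.Linear(18, 8)          # local (bottom) models
top_model = nn.Sequential(nn.ReLU(), nn.Linear(16, 2))           # global (top) model at A
opt_a = torch.optim.SGD(list(bottom_a.parameters()) + list(top_model.parameters()), lr=0.1)
opt_b = torch.optim.SGD(bottom_b.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for step in range(5):
    # 1) Each party computes local embeddings; B "sends" its embedding to A.
    h_a, h_b = bottom_a(x_a), bottom_b(x_b)
    h_b_recv = h_b.detach().requires_grad_(True)      # received copy, treated as a leaf
    # 2) A aggregates the embeddings, applies the global model and computes the loss.
    logits = top_model(torch.cat([h_a, h_b_recv], dim=1))
    loss = loss_fn(logits, y)
    # 3) Backward pass at A; A "returns" the embedding gradient to B, which updates locally.
    opt_a.zero_grad(); opt_b.zero_grad()
    loss.backward()                                   # grads for top_model, bottom_a, h_b_recv
    h_b.backward(h_b_recv.grad)                       # grads for bottom_b from the sent gradient
    opt_a.step(); opt_b.step()
    print(f"step {step}: loss = {loss.item():.4f}")
```

In VFLAIR the same loop is organized into separate party and communication-protocol modules; this sketch only illustrates the forward embedding and backward gradient flow between parties.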
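The Dataset Splits row quotes fixed holdout fractions from Appendix H.1 (for example, 20% test for Breast Cancer and 30% for Adult Income). The sketch below shows such a split on the scikit-learn copy of the Breast Cancer data, followed by an illustrative vertical (column-wise) partition of the rows between two parties; the loading code and the 15/15 feature split are assumptions, not the paper's exact preprocessing.

```python
# Hedged sketch of the quoted holdout split (20% test for Breast Cancer), assuming the
# scikit-learn copy of the dataset; the paper's exact preprocessing is not reproduced.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)             # 569 samples, 30 features
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0)               # 20% held out for testing
print(X_train.shape, X_test.shape)

# In the vertical setting, each party then keeps a column slice of the same rows
# (the 15/15 split below is purely illustrative).
X_train_a, X_train_b = X_train[:, :15], X_train[:, 15:]
```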
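The Experiment Setup row lists the default batch size, its per-dataset exceptions, and the tree-based VFL hyper-parameters from Appendix H.2. They are restated below as a configuration sketch; the dictionary keys are illustrative and do not correspond to VFLAIR's actual configuration schema.

```python
# Hedged restatement of the quoted hyper-parameters (Appendix H.2); key names are
# illustrative, not VFLAIR's actual configuration schema.
nn_vfl_config = {
    "batch_size_default": 1024,
    "batch_size_overrides": {    # exceptions quoted in the Experiment Setup row
        "MNIST": 2048,
        "Criteo": 8192,
        "Avazu": 8192,
        "News20-S5": 128,
    },
    # per-dataset learning rates and epochs are listed in Tabs. 11-12 of the paper
}

tree_vfl_config = {
    "n_trees_per_party": 5,      # 5 trees per party under all circumstances
    "max_depth": 6,
    "learning_rate": 0.003,      # used only for XGBoost
}
```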