VertiBench: Advancing Feature Distribution Diversity in Vertical Federated Learning Benchmarks

Authors: Zhaomin Wu, Junyi Hou, Bingsheng He

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our comprehensive evaluation of cutting-edge VFL algorithms provides valuable insights for future research in the field." and "This section benchmarks cutting-edge VFL algorithms, with a detailed review in Section 4.1. Experimental settings are outlined in Section 4.2, and results regarding VFL accuracy and synthetic-real correlation are in Sections 4.3 and 4.4, respectively."
Researcher Affiliation | Academia | "Zhaomin Wu, Junyi Hou, Bingsheng He, National University of Singapore, {zhaomin,junyi.h,hebs}@comp.nus.edu.sg"
Pseudocode | Yes | "Algorithm 1: Feature Splitting by Correlation" and "Algorithm 5: Feature Splitting by Importance" (an illustrative feature-splitting sketch appears after this table)
Open Source Code | Yes | "The VertiBench source code is available on GitHub (Wu et al., 2023a), with data splitting tools installable from PyPI (Wu et al., 2023b)." and "The code for this study is accessible via a GitHub repository (Wu et al., 2023a), accompanied by a README.md file that provides guidelines for environment setup and result reproduction."
Open Datasets | Yes | "Our experiments utilize 11 datasets: nine centralized ones (covtype (Blackard, 1998), msd (Bertin-Mahieux, 2011), gisette (Guyon et al., 2008), realsim (Andrew, 2015), epsilon (Guo-Xun et al., 2008), letter (Slate, 1991), radar (Khosravi, 2020), MNIST (Deng, 2012), CIFAR10 (Krizhevsky and Hinton, 2009)), and two real-world VFL datasets (NUS-WIDE (Chua et al., 2009), Vehicle (Duarte and Hu, 2004)), with detailed descriptions available in Appendix F." and "Although the website is currently under construction, we have made Satellite available via a public Google Drive link (Anonymized, 2023) for review purposes."
Dataset Splits | Yes | "Each dataset is partitioned into 80% training and 20% testing instances except NUS-WIDE, MNIST, and CIFAR10 with pre-defined test set." and "To ensure the reliability of our results, we conduct five runs for each algorithm, using seeds ranging from 0 to 4 to randomly split the datasets for each run, and then compute their mean metrics and standard deviation." (a split-protocol sketch appears after this table)
Hardware Specification | Yes | "The hardware configuration used for C-VFL, GAL, SplitNN, and FedTree consists of 2x AMD EPYC 7543 32-Core Processors, 4x A100 GPUs, and 503.4 GB of RAM, running on Python 3.10.11 with PyTorch 2.0.0, Linux 5.15.0-71-generic, Ubuntu 22.04.2 LTS." and "Pivot is compiled from source using CMake 3.19.7, g++ 9.5.0, libboost 1.71.0, libscapi with git commit hash 1f70a88, and runs on a slurm cluster with AMD EPYC 7V13 64-Core Processor with the same number of cores as 2x AMD EPYC 7543 used for other algorithms." and "We conducted real distributed experiments on four distinct machines to evaluate the communication cost, the machines are equipped with AMD EPYC 7V13 64-Core Processors, 503Gi RAM, 1GbE NICs (Intel I350), and varying GPUs (AMD Instinct MI210 for P1 and P2, AMD Instinct MI100 for P3 and P4)."
Software Dependencies | Yes | "The hardware configuration used for C-VFL, GAL, SplitNN, and FedTree consists of 2x AMD EPYC 7543 32-Core Processors, 4x A100 GPUs, and 503.4 GB of RAM, running on Python 3.10.11 with PyTorch 2.0.0, Linux 5.15.0-71-generic, Ubuntu 22.04.2 LTS." and "For FATE framework, we are using federatedai/standalone_fate Docker image, running with Python 3.8.13 on Docker 23.0.2." and "Pivot is compiled from source using CMake 3.19.7, g++ 9.5.0, libboost 1.71.0, libscapi with git commit hash 1f70a88" and "The distributed experimental environment consisted of Python 3.9.12 and PyTorch 2.0.1+rocm5.4.2."
Experiment Setup | Yes | "For models based on split-GBDT, such as SecureBoost, FedTree, and Pivot, our experiments are conducted with the following hyperparameters: learning_rate=0.1, num_trees=50, max_bin=32, and max_depth=6." and "With regard to split-NN-based models, specifically SplitNN and C-VFL, each local model is trained by a two-layer multi-layer perceptron (MLP) with each hidden layer containing 100 units. The corresponding aggregated model is a single-layer MLP with 200 hidden units. The learning rate, chosen from the set {10^-4, 10^-3, 3×10^-3}, is contingent on the specific algorithm and dataset. The number of iterations is fixed at 50 for SplitNN and 200 for C-VFL, with the latter setting aimed at ensuring model convergence. We also test C-VFL using four quantization buckets, a single vector quantization dimension, and a top-k compressor as recommended in the default setting. The number of local rounds Q in C-VFL is set to 10." and "Finally, for the ensemble-based model, GAL, we utilize a learning_rate=0.01, local_epoch=20, global_epoch=20, and batch_size=512, with the assist mode set to stack. In the GAL framework, each party employs an MLP model consisting of two hidden layers, each containing 100 hidden units." (a model-configuration sketch appears after this table)
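For readers unfamiliar with the splitting algorithms named in the Pseudocode row, below is a minimal sketch of what importance-based feature splitting can look like. The Dirichlet-weighted random assignment and the function name are assumptions for illustration only, not a reproduction of the paper's Algorithm 1 or Algorithm 5.

    import numpy as np

    def split_features_by_importance(n_features, n_parties, alpha, seed=0):
        # Illustrative assumption, not VertiBench's Algorithm 5: sample
        # per-party weights from a symmetric Dirichlet(alpha), then assign
        # each feature to a party with probability proportional to those
        # weights. Smaller alpha produces more skewed splits.
        rng = np.random.default_rng(seed)
        weights = rng.dirichlet([alpha] * n_parties)
        assignment = rng.choice(n_parties, size=n_features, p=weights)
        return [np.where(assignment == p)[0] for p in range(n_parties)]

    # Example: split 100 feature columns among 4 parties.
    party_feature_ids = split_features_by_importance(100, 4, alpha=1.0)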
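The Dataset Splits row fully describes the evaluation protocol; a small sketch of that protocol follows. The helper name train_and_score is a placeholder for training and scoring one VFL algorithm and does not come from the paper.

    import numpy as np
    from sklearn.model_selection import train_test_split

    def evaluate_five_runs(X, y, train_and_score):
        # Five runs with seeds 0-4; each run re-splits the data into 80%
        # train / 20% test, then the mean and standard deviation of the
        # per-run metric are reported.
        scores = []
        for seed in range(5):
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, test_size=0.2, random_state=seed)
            scores.append(train_and_score(X_tr, y_tr, X_te, y_te))
        return float(np.mean(scores)), float(np.std(scores))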
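The Experiment Setup row specifies the split-NN architecture exactly; a minimal PyTorch sketch of that configuration is shown below. The class names and example wiring are assumptions for illustration, not the benchmarked SplitNN or C-VFL implementations.

    import torch.nn as nn

    class LocalMLP(nn.Module):
        # Per-party bottom model: two hidden layers with 100 units each.
        def __init__(self, in_dim, hidden=100):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(in_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
            )

        def forward(self, x):
            return self.net(x)

    class AggregatedMLP(nn.Module):
        # Top model: one hidden layer with 200 units over the concatenated
        # party outputs, followed by the task head.
        def __init__(self, in_dim, n_classes, hidden=200):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(in_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, n_classes),
            )

        def forward(self, h):
            return self.net(h)

    # Example: two parties, each holding 50 features, on a 10-class task.
    bottom_models = [LocalMLP(50) for _ in range(2)]
    top_model = AggregatedMLP(2 * 100, 10)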