LESS-VFL: Communication-Efficient Feature Selection for Vertical Federated Learning
Authors: Timothy Castiglia, Yi Zhou, Shiqiang Wang, Swanand Kadhe, Nathalie Baracaldo, Stacy Patterson
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide extensive empirical evidence that LESS-VFL can achieve high accuracy and remove spurious features at a fraction of the communication cost of other feature selection approaches. |
| Researcher Affiliation | Collaboration | 1Rensselaer Polytechnic Institute 2IBM Research. |
| Pseudocode | Yes | Algorithm 1 LESS-VFL implemented using P-SGD |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its source code, nor does it provide a link to a code repository. |
| Open Datasets | Yes | MIMIC-III (Johnson et al., 2016; Harutyunyan et al., 2019): Hospital dataset... Activity (Anguita et al., 2013): Time-series positional data... Phishing (Dua & Graff, 2017): Dataset... Gina (Guyon, 2007): Hand-written two-digit images. Sylva (Guyon, 2007): Forest cover type information. |
| Dataset Splits | No | The paper mentions splitting features among parties but does not provide specific details on training, validation, or test dataset splits (e.g., percentages, sample counts, or references to predefined splits). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models, memory specifications, or cloud computing instance types. |
| Software Dependencies | No | The paper mentions the ADAM optimizer and P-SGD but does not provide specific version numbers for any software libraries, frameworks, or operating systems used in the experimental setup. |
| Experiment Setup | Yes | We run a grid search to determine regularization parameters for LESS-VFL, local lasso, and group lasso, and the number of pre-training epochs for LESS-VFL and local lasso. We use the ADAM optimizer with a learning rate of 0.01 when employing Algorithm 2 in VFL (Original and Spurious) and pre-training and post feature selection in local lasso and LESS-VFL. We run 150 epochs of P-SGD for embedding component selection in LESS-VFL and feature selection in LESS-VFL and local lasso, which we found to be a sufficient number of iterations for the training loss to plateau. |
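The setup row above refers to group-lasso and local-lasso baselines for feature selection. As a hedged illustration only (not the paper's implementation, and with toy weights invented for the example), the following sketch shows the block soft-thresholding step that group-lasso-style methods use to zero out entire feature groups, which is how spurious features end up removed:

```python
import numpy as np

def group_soft_threshold(W, lam, lr):
    """Block soft-thresholding: the proximal operator of the group-lasso
    penalty lam * sum_g ||W_g||_2, applied per row (one group per feature).
    Rows whose L2 norm falls at or below lr * lam are zeroed, which
    corresponds to dropping that feature entirely."""
    out = np.zeros_like(W)
    thresh = lr * lam
    for g in range(W.shape[0]):
        norm = np.linalg.norm(W[g])
        if norm > thresh:
            # Shrink the surviving group toward zero by the threshold.
            out[g] = (1.0 - thresh / norm) * W[g]
    return out

# Toy example (hypothetical weights): rows 0 and 2 are informative
# features; row 1 is a near-zero "spurious" feature.
W = np.array([[0.90, -0.40],
              [0.02,  0.01],   # spurious feature: tiny weights
              [0.70,  0.60]])
W_new = group_soft_threshold(W, lam=0.5, lr=0.1)
selected = [g for g in range(W.shape[0]) if np.linalg.norm(W_new[g]) > 0]
# Only the informative features (rows 0 and 2) survive the threshold.
```

In a training loop, this proximal step would follow each gradient update; features whose weight groups are driven to exactly zero incur no further communication, which is the cost the paper's communication comparisons measure.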