Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Federated Learning with Reduced Information Leakage and Computation

Authors: Tongxin Yin, Xuwei Tan, Xueru Zhang, Mohammad Mahdi Khalili, Mingyan Liu

TMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on both synthetic and real-world data show that the Upcycled-FL strategy can be adapted to many existing FL frameworks and consistently improve the privacy-accuracy trade-off.
Researcher Affiliation | Academia | Tongxin Yin* (Department of Electrical and Computer Engineering, University of Michigan); Xuwei Tan* (Department of Computer Science and Engineering, The Ohio State University); Xueru Zhang* (Department of Computer Science and Engineering, The Ohio State University); Mohammad Mahdi Khalili (Department of Computer Science and Engineering, The Ohio State University); Mingyan Liu (Department of Electrical and Computer Engineering, University of Michigan)
Pseudocode | Yes | Algorithm 1 Proposed aggregation strategy: Upcycled-FL
Open Source Code | Yes | Code available at https://github.com/osu-srml/Upcycled-FL.
Open Datasets | Yes | We adopt two real datasets: 1) FEMNIST, a federated version of EMNIST (Cohen et al., 2017). Here, a multilayer perceptron (MLP) consisting of two linear layers with a hidden dimension of 14x14, interconnected by ReLU activation functions, is used to learn from FEMNIST; 2) Sentiment140 (Sent140), a text sentiment analysis task on tweets (Go et al., 2009).
Dataset Splits | Yes | Table 2: Details of datasets. Numbers in parentheses represent the amount of test data. All numbers are rounded to integers.
Hardware Specification | Yes | All experiments are conducted on a server equipped with multiple NVIDIA A5000 GPUs, two AMD EPYC 7313 CPUs, and 256GB memory.
Software Dependencies | Yes | The code is implemented with Python 3.8 and PyTorch 1.13.0 on Ubuntu 20.04.
Experiment Setup | Yes | We employ SGD as the local optimizer, with a momentum of 0.5, and set the number of local update epochs E to 10 at each iteration m. Note that without privacy concerns, any classifier and loss function can be plugged into Upcycled-FL. However, if we adopt objective perturbation as privacy protection, the loss function should also satisfy assumptions in Theorem 6.3. We take the cross-entropy loss as our loss function throughout all experiments. To simulate device heterogeneity, we randomly select a fraction of devices to train at each round, and assume there are stragglers that cannot train for full rounds; both devices and stragglers are selected by random seed to ensure they are the same for all algorithms.
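The quoted experiment setup pins down a few concrete choices: SGD with momentum 0.5 as the local optimizer, E = 10 local epochs, cross-entropy loss, and seeded random selection of participating devices and stragglers so the selection is identical across algorithms. A minimal PyTorch sketch of that configuration might look as follows; the function names, the learning rate, and the toy straggler handling are illustrative assumptions, not taken from the paper or its code:

```python
import random
import torch
import torch.nn as nn

E = 10  # local update epochs per round, as stated in the setup

def make_local_optimizer(model):
    # SGD with momentum 0.5 per the setup; the learning rate is a placeholder.
    return torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)

def select_devices(num_devices, fraction, straggler_fraction, seed):
    # A fixed seed makes the device/straggler draw identical across algorithms.
    rng = random.Random(seed)
    chosen = rng.sample(range(num_devices), max(1, int(fraction * num_devices)))
    stragglers = set(rng.sample(chosen, int(straggler_fraction * len(chosen))))
    return chosen, stragglers

def local_update(model, data, targets, epochs=E):
    # One client's local training: E epochs of SGD on the cross-entropy loss.
    loss_fn = nn.CrossEntropyLoss()
    opt = make_local_optimizer(model)
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(data), targets)
        loss.backward()
        opt.step()
    return model
```

In a full round, stragglers would run fewer local epochs than the others; here they are only identified, since the paper's exact straggler schedule is not quoted above.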