Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Federated Learning with Reduced Information Leakage and Computation
Authors: Tongxin Yin, Xuwei Tan, Xueru Zhang, Mohammad Mahdi Khalili, Mingyan Liu
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on both synthetic and real-world data show that the Upcycled-FL strategy can be adapted to many existing FL frameworks and consistently improve the privacy-accuracy trade-off. |
| Researcher Affiliation | Academia | Tongxin Yin (Department of Electrical and Computer Engineering, University of Michigan); Xuwei Tan (Department of Computer Science and Engineering, The Ohio State University); Xueru Zhang (Department of Computer Science and Engineering, The Ohio State University); Mohammad Mahdi Khalili (Department of Computer Science and Engineering, The Ohio State University); Mingyan Liu (Department of Electrical and Computer Engineering, University of Michigan) |
| Pseudocode | Yes | Algorithm 1 Proposed aggregation strategy: Upcycled-FL |
| Open Source Code | Yes | Code available at https://github.com/osu-srml/Upcycled-FL. |
| Open Datasets | Yes | We adopt two real datasets: 1) FEMNIST, a federated version of EMNIST (Cohen et al., 2017). A multilayer perceptron (MLP) consisting of two linear layers with a hidden dimension of 14x14, interconnected by ReLU activation functions, is used to learn from FEMNIST; 2) Sentiment140 (Sent140), a text sentiment analysis task on tweets (Go et al., 2009). |
| Dataset Splits | Yes | Table 2: Details of datasets. Numbers in parentheses represent the amount of test data. All numbers are rounded to integers. |
| Hardware Specification | Yes | All experiments are conducted on a server equipped with multiple NVIDIA A5000 GPUs, two AMD EPYC 7313 CPUs, and 256GB memory. |
| Software Dependencies | Yes | The code is implemented with Python 3.8 and PyTorch 1.13.0 on Ubuntu 20.04. |
| Experiment Setup | Yes | We employ SGD as the local optimizer, with a momentum of 0.5, and set the number of local update epochs E to 10 at each iteration m. Note that without privacy concerns, any classifier and loss function can be plugged into Upcycled-FL. However, if we adopt objective perturbation as privacy protection, the loss function should also satisfy assumptions in Theorem 6.3. We take the cross-entropy loss as our loss function throughout all experiments. To simulate device heterogeneity, we randomly select a fraction of devices to train at each round, and assume there are stragglers that cannot train for the full number of epochs; both devices and stragglers are selected using a fixed random seed so that the selection is identical across all algorithms. |
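The seeded device- and straggler-selection scheme described in the experiment setup can be sketched as follows. This is a minimal stdlib-only illustration, not the authors' implementation; the function names, the per-round seeding scheme, and the straggler epoch-count rule are all assumptions, with only the participation fraction, straggler idea, and E = 10 local epochs taken from the row above.

```python
import random

def sample_round(n_devices, participation_frac, straggler_frac, round_idx, seed=42):
    """Return (participants, stragglers) for one FL round.

    Seeding the sampler with (seed, round) makes the selection
    deterministic, so every compared algorithm sees the exact same
    devices and stragglers each round, as the setup describes.
    """
    rng = random.Random(f"{seed}:{round_idx}")  # str seed -> reproducible draw
    n_part = max(1, round(participation_frac * n_devices))
    participants = rng.sample(range(n_devices), n_part)
    n_strag = round(straggler_frac * n_part)
    stragglers = set(rng.sample(participants, n_strag))
    return participants, stragglers

def local_epochs(device, stragglers, full_epochs=10):
    """Stragglers complete fewer than the full E = 10 local epochs.

    The exact truncation rule is an assumption; here each straggler
    deterministically runs between 1 and E-1 epochs based on its id.
    """
    if device in stragglers:
        return random.Random(device).randint(1, full_epochs - 1)
    return full_epochs
```

With, say, 100 devices, a 10% participation fraction, and half of the participants as stragglers, `sample_round(100, 0.1, 0.5, round_idx=0)` yields 10 participants and 5 stragglers, identically on every call with the same arguments.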