One-Pass Distribution Sketch for Measuring Data Heterogeneity in Federated Learning
Authors: Zichang Liu, Zhaozhuo Xu, Benjamin Coleman, Anshumali Shrivastava
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct empirical evaluations to answer three questions: (1) Does the one-pass sketch distance reflect the differences between distributions? (2) Does the sketch distance help convergence in FL? (3) Does the sketch distance retrieve the best personalized models? To answer these three questions, we conducted three sets of experiments. |
| Researcher Affiliation | Collaboration | Zichang Liu Rice University zichangliu@rice.edu Zhaozhuo Xu Stevens Institute of Technology zxu79@stevens.edu Benjamin Coleman Rice University benjamin.ray.coleman@gmail.com Anshumali Shrivastava Rice University & ThirdAI Corp. anshumali@rice.edu ... Now with Google DeepMind. |
| Pseudocode | Yes | Algorithm 1 One-Pass Distribution Sketch (a minimal illustrative sketch of this idea appears below the table) |
| Open Source Code | Yes | Code is available at https://github.com/lzcemma/RACE_Distance |
| Open Datasets | Yes | Dataset: We evaluate Algorithm 3 and Algorithm 2 on both vision and language datasets. For visual classification, we use the MNIST dataset [51] and FEMNIST [52]. ... We also use the Shakespeare next-character prediction dataset [6] for language-based FL. |
| Dataset Splits | No | The paper mentions 'train' and 'test' sets but does not explicitly provide details about a 'validation' set or split. |
| Hardware Specification | Yes | Our FL codebase, including FL workflow, LSH functions, and proposed algorithms, is implemented on PyTorch [55]. We test Algorithm 3 and Algorithm 2 on a server with 8 Nvidia Tesla V100 GPUs and a 48-core/96-thread processor (Intel(R) Xeon(R) Gold 5220R CPU @ 2.20GHz). |
| Software Dependencies | No | Our FL codebase, including FL workflow, LSH functions, and proposed algorithms, is implemented on PyTorch [55]. |
| Experiment Setup | Yes | For the MNIST dataset (both MNIST and MNIST Uniform + Dirichlet), both Algorithm 3 and FedAvg are trained for 200 rounds. In each round, K = 3 clients are selected from L active clients. Next, each client is trained for 20 epochs with batch size 32 and learning rate η = 0.0001. (A minimal FedAvg round sketch with these settings follows the table.) |
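
The pseudocode row above points to Algorithm 1. Below is a minimal, illustrative one-pass distribution sketch in that spirit: each data point is hashed exactly once with LSH functions shared across clients, incrementing one counter per row, and clients are compared by a distance between their normalized count arrays. The hash family (signed random projections), the sketch shape, the class name `OnePassSketch`, and the averaged L1 distance are all illustrative assumptions here, not the paper's exact construction.

```python
# Illustrative one-pass distribution sketch (RACE-style LSH counting).
# Assumptions: signed-random-projection hashes, R rows x 2^bits buckets,
# and an averaged L1 distance between normalized sketches.
import numpy as np

class OnePassSketch:
    def __init__(self, dim, rows=10, bits=8, seed=0):
        rng = np.random.default_rng(seed)
        self.bits = bits  # each row hashes into 2^bits counters
        # One signed random projection per bit, per row.
        self.projections = rng.standard_normal((rows, bits, dim))
        self.counts = np.zeros((rows, 2 ** bits))
        self.n = 0

    def add(self, x):
        """One pass over the data: hash x in every row, bump one counter per row."""
        signs = (np.einsum('rbd,d->rb', self.projections, x) >= 0).astype(np.int64)
        buckets = signs @ (1 << np.arange(self.bits))  # pack sign bits into a bucket index
        self.counts[np.arange(self.counts.shape[0]), buckets] += 1
        self.n += 1

    def normalized(self):
        return self.counts / max(self.n, 1)

def sketch_distance(a, b):
    """Assumed distance: L1 between normalized sketches, averaged over rows."""
    return np.abs(a.normalized() - b.normalized()).sum(axis=1).mean()

# Usage: all sketches share seed=42 so clients use the SAME hash functions,
# which is required for their sketches to be comparable.
rng = np.random.default_rng(1)
s1, s2, s3 = (OnePassSketch(dim=32, seed=42) for _ in range(3))
for _ in range(2000):
    s1.add(rng.standard_normal(32))
    s2.add(rng.standard_normal(32))
    s3.add(rng.standard_normal(32) + 1.5)  # shifted (heterogeneous) distribution
print(sketch_distance(s1, s2))  # small: same underlying distribution
print(sketch_distance(s1, s3))  # larger: different distribution
```

The experiment setup row reports 200 rounds of FedAvg with K = 3 clients sampled per round, 20 local epochs, batch size 32, and learning rate 1e-4. The PyTorch loop below mirrors that round structure under stated assumptions: a toy two-layer model and synthetic client data stand in for MNIST, and clients are sampled uniformly at random, whereas the paper's Algorithm 3 additionally uses the sketch distance (e.g., to inform client selection or personalization).

```python
# Minimal FedAvg round structure matching the reported hyperparameters
# (200 rounds, K = 3 clients/round, 20 local epochs, batch 32, lr = 1e-4).
# Model, data, and uniform sampling are placeholders, not the paper's setup.
import copy, random
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def local_train(model, loader, epochs=20, lr=1e-4):
    """Train a deep copy of the global model on one client's data."""
    model = copy.deepcopy(model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict()

def fedavg(states):
    """Element-wise average of client weights (equal client weighting assumed)."""
    avg = copy.deepcopy(states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in states]).mean(dim=0)
    return avg

# Toy federation: 10 clients, each with a small synthetic dataset.
torch.manual_seed(0)
clients = [DataLoader(TensorDataset(torch.randn(64, 20),
                                    torch.randint(0, 2, (64,))),
                      batch_size=32, shuffle=True) for _ in range(10)]
global_model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))

for rnd in range(200):
    selected = random.sample(clients, k=3)  # K = 3 of the L active clients
    states = [local_train(global_model, c) for c in selected]
    global_model.load_state_dict(fedavg(states))
```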
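Uniform sampling is the simplest baseline; swapping `random.sample` for a selection rule driven by `sketch_distance` between client sketches is where the paper's method would plug in.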