DISTREAL: Distributed Resource-Aware Learning in Heterogeneous Systems
Authors: Martin Rapp, Ramin Khalili, Kilian Pfeiffer, Jörg Henkel (pp. 8062-8071)
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implement our solution DISTREAL in an FL system, in which the availability of computational resources varies both between devices and over time. We show through extensive evaluation that DISTREAL significantly increases the convergence speed over the state of the art, and is robust to the rapid changes in resource availability at devices, without compromising on the final accuracy. This section demonstrates the benefits of DISTREAL with heterogeneous resource availability in an FL system. Experimental Setup: We study synchronous FL as described in the system model. We report the classification accuracy of the synchronized model at the end of each round. (A minimal sketch of this synchronous round loop follows the table.) |
| Researcher Affiliation | Collaboration | Martin Rapp (1), Ramin Khalili (2), Kilian Pfeiffer (1), Jörg Henkel (1); 1 Karlsruhe Institute of Technology, Karlsruhe, Germany; 2 Huawei Research Center, Munich, Germany |
| Pseudocode | Yes | Algorithm 1: Each Selected Device i (Client) and Algorithm 2: Server |
| Open Source Code | Yes | using an implementation of this structured dropout in PyTorch (Paszke et al. 2019), which is publicly available at https://git.scc.kit.edu/CES/DISTREAL. (An illustrative structured-dropout sketch follows the table.) |
| Open Datasets | Yes | The three datasets used in our experiments are Federated Extended MNIST (FEMNIST) (Cohen et al. 2017) with non-independently and identically distributed (non-iid) split data, similar to LEAF (Caldas et al. 2019), and CIFAR-10/100 (Krizhevsky and Hinton 2009). |
| Dataset Splits | No | FEMNIST consists of 641,828 training and 160,129 test examples, each a 28×28 grayscale image of one out of 62 classes (10 digits, 26 upper- and 26 lower-case letters). CIFAR-10 consists of 50,000 training and 10,000 test examples, each a 32×32 RGB image of one out of 10 classes such as airplane or frog. CIFAR-100 is similar to CIFAR-10 but uses 100 classes. The paper specifies train and test set sizes but does not explicitly mention a validation set split for the main experiments. (A torchvision loading sketch follows the table.) |
| Hardware Specification | Yes | We also report the training time of a single mini-batch on a Raspberry Pi 4, which serves as an example for an IoT device... The DSE for these NNs takes around 15, 270, and 330 compute-hours, respectively, on a system with an Intel Core i5-4570 and an NVIDIA GeForce GTX 980. |
| Software Dependencies | No | The paper mentions software like PyTorch and the pygmo2 library, but does not provide specific version numbers for these software dependencies (e.g., 'PyTorch (Paszke et al. 2019)' without a version number). |
| Experiment Setup | Yes | For FEMNIST, we use a similar network as used in Federated Dropout (Caldas et al. 2018), with a depth of 4 layers... We use DenseNet (Huang et al. 2017) for CIFAR-10 and CIFAR-100 with growth rate k = 12 and depth of 40 and 100, respectively... We train for 64 mini-batches with batch size 64... We use a population size of 64... We use s = 0.7, as it shows the best performance... We study four different values of λ ∈ {0.5, 1, 2, 4}. (These values are collected in the configuration sketch below.) |
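
The Research Type row describes the experimental protocol: synchronous FL, with the classification accuracy of the synchronized model reported at the end of each round. Below is a minimal sketch of such a round loop, assuming FedAvg-style weight averaging; the helper functions (`local_train`, `evaluate`), the client-loader structure, and all hyperparameter values are illustrative assumptions, not taken from the DISTREAL code, and the resource-aware subnetwork selection is abstracted away.

```python
import copy
import random

import torch
import torch.nn.functional as F


def local_train(model, loader, device, lr=0.1, num_batches=64):
    """Hypothetical local update: a few SGD mini-batches on the client's data."""
    model.train()
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for step, (x, y) in enumerate(loader):
        if step >= num_batches:
            break
        opt.zero_grad()
        loss = F.cross_entropy(model(x.to(device)), y.to(device))
        loss.backward()
        opt.step()


def evaluate(model, loader, device):
    """Hypothetical evaluation: top-1 accuracy of the synchronized model."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            pred = model(x.to(device)).argmax(dim=1)
            correct += (pred == y.to(device)).sum().item()
            total += y.numel()
    return correct / max(total, 1)


def run_synchronous_fl(global_model, client_loaders, test_loader,
                       rounds=100, clients_per_round=10, device="cpu"):
    """Per round: selected clients train locally, the server averages their
    weights (FedAvg-style), and the synchronized model's test accuracy is
    reported at the end of the round."""
    global_model = global_model.to(device)
    for rnd in range(rounds):
        # client_loaders: dict mapping client id -> DataLoader over local data.
        selected = random.sample(sorted(client_loaders), clients_per_round)
        states = []
        for cid in selected:
            local_model = copy.deepcopy(global_model)
            local_train(local_model, client_loaders[cid], device)
            states.append(local_model.state_dict())
        # Element-wise mean of the client weights, cast back to original dtypes.
        avg = {k: torch.stack([s[k].float() for s in states]).mean(dim=0)
                     .to(states[0][k].dtype)
               for k in states[0]}
        global_model.load_state_dict(avg)
        acc = evaluate(global_model, test_loader, device)
        print(f"round {rnd}: test accuracy {acc:.3f}")
```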
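
The repository linked in the Open Source Code row contains the authors' PyTorch implementation of structured dropout. As a rough illustration only (not the released code), the sketch below skips a random subset of a convolution's output filters for each mini-batch, with a dropout rate `p` that can be changed at run time (e.g. by a resource-aware controller); it handles only the standard `groups == 1` case.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class StructuredDropoutConv2d(nn.Conv2d):
    """Illustrative structured dropout for a conv layer: per mini-batch, a random
    subset of output filters is skipped, so fewer filters are computed/updated.
    The fraction of skipped filters `p` can be adjusted between mini-batches."""

    def __init__(self, *args, p=0.0, **kwargs):
        super().__init__(*args, **kwargs)
        self.p = p  # fraction of output filters to skip during training

    def forward(self, x):
        if not self.training or self.p <= 0.0 or self.groups != 1:
            # Sketch covers only training with groups == 1; otherwise fall back.
            return super().forward(x)
        keep = max(1, math.ceil((1.0 - self.p) * self.out_channels))
        idx = torch.randperm(self.out_channels, device=x.device)[:keep]
        weight = self.weight[idx]
        bias = self.bias[idx] if self.bias is not None else None
        # Only the kept filters are convolved; skipped channels stay zero so the
        # output shape still matches what downstream layers expect.
        y = F.conv2d(x, weight, bias, self.stride, self.padding,
                     self.dilation, self.groups)
        out = y.new_zeros(x.shape[0], self.out_channels, *y.shape[2:])
        out[:, idx] = y / (1.0 - self.p)  # inverted-dropout style rescaling
        return out
```

Usage would look like `StructuredDropoutConv2d(3, 64, kernel_size=3, padding=1, p=0.3)`; setting a different `p` per layer is how a vector of per-layer dropout rates (as explored by the paper's DSE) could be applied in this sketch.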
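
Regarding the Open Datasets and Dataset Splits rows: CIFAR-10/100 are available directly through torchvision, while FEMNIST is a LEAF-style non-iid, per-writer partition of EMNIST "byclass" (62 classes) that torchvision does not ship pre-partitioned. A loading sketch, assuming torchvision is installed:

```python
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# EMNIST "byclass": 28x28 grayscale images, 62 classes (digits + letters).
# The FEMNIST split used in the paper additionally partitions these non-iid
# by writer, similar to LEAF; that partitioning is not shown here.
emnist_train = datasets.EMNIST("data", split="byclass", train=True,
                               download=True, transform=to_tensor)
emnist_test = datasets.EMNIST("data", split="byclass", train=False,
                              download=True, transform=to_tensor)

# CIFAR-10/100: 32x32 RGB images, 50,000 train / 10,000 test examples each.
cifar10_train = datasets.CIFAR10("data", train=True, download=True, transform=to_tensor)
cifar10_test = datasets.CIFAR10("data", train=False, download=True, transform=to_tensor)
cifar100_train = datasets.CIFAR100("data", train=True, download=True, transform=to_tensor)
cifar100_test = datasets.CIFAR100("data", train=False, download=True, transform=to_tensor)
```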
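
The Experiment Setup row quotes several hyperparameters; the sketch below only collects the quoted values in one place. Settings not stated in the excerpt (e.g. learning rate, number of rounds) are intentionally omitted rather than guessed.

```python
# Hyperparameters quoted in the Experiment Setup row, gathered for reference.
EXPERIMENT_CONFIG = {
    "femnist": {"model": "4-layer CNN (similar to Federated Dropout)"},
    "cifar10": {"model": "DenseNet", "growth_rate": 12, "depth": 40},
    "cifar100": {"model": "DenseNet", "growth_rate": 12, "depth": 100},
    "local_training": {"mini_batches_per_round": 64, "batch_size": 64},
    "dse": {"population_size": 64},
    "s": 0.7,                      # reported as the best-performing value
    "lambda_values": [0.5, 1, 2, 4],
}
```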