Efficient Federated Domain Translation
Authors: Zeyu Zhou, Sheikh Shams Azam, Christopher Brinton, David I. Inouye
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "4 EXPERIMENTS", "We empirically demonstrate that our FedINB approach performs significantly better than standard translation models under the practical limited communication setting.", "We test FedINB on Rotated MNIST and Fashion MNIST (Ghifary et al., 2015). For the federated domain translation experiment, there are 5 clients participating in the training and each has data from one of domains 0, 15, 30, 45, 60", "For the federated domain translation experiment, we use the empirical Wasserstein Distance (WD) and FID score (Heusel et al., 2017) between the original samples and translated samples as evaluation metric." (see the metric sketch below the table) |
| Researcher Affiliation | Collaboration | "Zeyu Zhou, Sheikh Shams Azam, Christopher Brinton, David I. Inouye, Elmore Family School of ECE, Purdue University, {zhou1059, azam1, cgb, dinouye}@purdue.edu" and "Currently at Apple." |
| Pseudocode | Yes | "Algorithm 1 Fed-multi-max-K-SW", "Algorithm 2 Fed-1D-Barycenter", "Algorithm 3 Federated Iterative Naïve Barycenter" (see the 1D-barycenter sketch below the table) |
| Open Source Code | No | "For the network structure (encoder and classifier) and training hyperparameters, we modify based on the default setup in the repository of DIRT (Nguyen et al., 2021), which can be found at their public repository https://github.com/atuannguyen/DIRT. The only difference is that we change Batch Normalization to Instance Normalization." The paper does not provide a link or statement for their *own* open-source code. |
| Open Datasets | Yes | "Datasets: Following the setup in Zhou et al. (2022), we test FedINB on Rotated MNIST and Fashion MNIST (Ghifary et al., 2015)." and "In the original INB paper, the authors set J = 200 for MNIST (LeCun & Cortes, 2010) and Fashion MNIST (Xiao et al., 2017)." |
| Dataset Splits | No | "For training, we use 10,000 samples from the MNIST and Fashion MNIST training set as the dataset of domain 0, where each class has 1000 samples. Then we use all samples to generate 10,000 samples for all other training domains (15, 30, 45, 60). So, the total size of training data is 50,000. For evaluation of FedINB, we also use 10,000 samples from the MNIST and Fashion MNIST test set, and create other samples in the same way. So the total size of test data is also 50,000." No explicit mention of a validation set or clear train/validation/test split for reproducibility. (see the rotated-domain sketch below the table) |
| Hardware Specification | Yes | "Additionally, it is important to note that INB takes around 5 minutes to finish training on a single RTX A5000 GPU while FedStarGAN takes around 2.5 hours to generate good samples on a single Tesla P100 GPU." |
| Software Dependencies | No | "For MNIST, the encoder is composed of [nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2), nn.Conv2d(16, 8, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2)] where nn represents torch.nn in PyTorch." This implies PyTorch, but no version number is specified for PyTorch or any other library. (the quoted encoder is assembled into a runnable module below the table) |
| Experiment Setup | Yes | "For MNIST, we use 64 as batch size and 0.001 as learning rate. We set the regularization weight of DIRT to be 2. For Fashion MNIST, we use 128 as batch size and 0.0001 as learning rate. We set the regularization weight of DIRT to be 10. For Rotated MNIST and all test domains, we run FedDIRT/FedAvg (1-batch) for 2000 iterations, FedDIRT/FedAvg (10-batch) for 2500 iterations and FedDIRT/FedAvg (100-batch) for 3000 iterations." Network structure details appear in Appendix E.1. (these values are collected in a config sketch below the table) |
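
The evaluation quoted under Research Type reports an empirical Wasserstein Distance and an FID score between original and translated samples, but the exact estimator is not spelled out. Below is a minimal sketch of one common choice, a sliced (random-projection) Wasserstein distance in NumPy; the function name and projection count are illustrative, and FID would additionally require a pretrained Inception network (e.g. via a package such as pytorch-fid), which is not shown here.

```python
# Hedged sketch: empirical sliced Wasserstein distance between two sample sets.
# The paper reports an empirical WD; the exact estimator is not given, so random
# 1D projections are used here as an illustrative stand-in.
import numpy as np

def sliced_wasserstein(x, y, n_projections=128, seed=0):
    """x, y: (n_samples, d) arrays of flattened original / translated images (equal sizes assumed)."""
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    total = 0.0
    for _ in range(n_projections):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)
        # Project both sample sets onto the same random direction and sort.
        px, py = np.sort(x @ theta), np.sort(y @ theta)
        # 1D Wasserstein-2 distance between the two empirical distributions.
        total += np.mean((px - py) ** 2)
    return np.sqrt(total / n_projections)

# Example usage with dummy data standing in for real translated samples.
orig = np.random.rand(1000, 784)
translated = np.random.rand(1000, 784)
print(sliced_wasserstein(orig, translated))
```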
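Algorithm 2 in the paper is named Fed-1D-Barycenter. The sketch below illustrates only the underlying one-dimensional idea: the Wasserstein-2 barycenter of 1D empirical distributions with equal weights can be obtained by averaging quantile functions (sorted samples). The actual federated message passing, weighting, and sample handling in the paper's algorithm may differ, and the function names are illustrative.

```python
# Hedged sketch of the 1D barycenter idea behind Fed-1D-Barycenter.
import numpy as np

def client_local_quantiles(samples, n_quantiles=100):
    """Each client summarizes its 1D samples by a fixed grid of quantiles."""
    qs = np.linspace(0, 1, n_quantiles)
    return np.quantile(samples, qs)

def server_barycenter(client_quantiles):
    """The server averages the clients' quantile vectors to get the barycenter quantiles."""
    return np.mean(np.stack(client_quantiles, axis=0), axis=0)

# Example: 3 clients holding shifted 1D distributions.
clients = [np.random.normal(loc=mu, size=500) for mu in (-1.0, 0.0, 2.0)]
bary_quantiles = server_barycenter([client_local_quantiles(c) for c in clients])
print(bary_quantiles[:5])
```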
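The domains described under Dataset Splits are rotations of MNIST/Fashion MNIST by 0, 15, 30, 45, and 60 degrees, with 10,000 samples per domain. A minimal torchvision sketch of constructing such domains is shown below; the class-balanced subsampling (1000 images per class) used in the paper is not reproduced, and the helper name is illustrative.

```python
# Hedged sketch of building rotated MNIST domains (0, 15, 30, 45, 60 degrees).
import torch
from torchvision import datasets, transforms
from torchvision.transforms import functional as TF

mnist = datasets.MNIST(root="./data", train=True, download=True,
                       transform=transforms.ToTensor())

def make_rotated_domain(dataset, angle, n_samples=10000):
    """Rotate the first n_samples images by `angle` degrees to form one domain."""
    images = torch.stack([dataset[i][0] for i in range(n_samples)])
    return TF.rotate(images, angle)

domains = {angle: make_rotated_domain(mnist, angle) for angle in (0, 15, 30, 45, 60)}
print({angle: imgs.shape for angle, imgs in domains.items()})
```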
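The Software Dependencies row quotes the MNIST encoder layer list verbatim. For convenience, the same layers assembled into a runnable PyTorch module are shown below; the classifier head and the Batch-to-Instance Normalization change mentioned in the Open Source Code row are not included.

```python
# The quoted MNIST encoder layers, assembled into a runnable nn.Sequential.
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(2),          # 28x28 -> 14x14
    nn.Conv2d(16, 8, 3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(2),          # 14x14 -> 7x7
)

# Sanity check on an MNIST-shaped batch.
x = torch.randn(4, 1, 28, 28)
print(encoder(x).shape)  # torch.Size([4, 8, 7, 7])
```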
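Finally, the hyperparameters quoted under Experiment Setup can be collected into a single reference dictionary. The key names below are illustrative; only the values come from the quoted text.

```python
# Hedged summary of the reported hyperparameters; key names are illustrative,
# only the values are taken from the quoted Experiment Setup text.
FED_DIRT_CONFIG = {
    "MNIST": {"batch_size": 64, "lr": 1e-3, "dirt_reg_weight": 2},
    "FashionMNIST": {"batch_size": 128, "lr": 1e-4, "dirt_reg_weight": 10},
    # FedDIRT / FedAvg iterations on Rotated MNIST, keyed by local batches per round.
    "iterations": {"1-batch": 2000, "10-batch": 2500, "100-batch": 3000},
}

print(FED_DIRT_CONFIG["MNIST"]["lr"])  # 0.001
```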