FedMix: Approximation of Mixup under Mean Augmented Federated Learning
Authors: Tehrim Yoon, Sumin Shin, Sung Ju Hwang, Eunho Yang
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our result on various benchmark datasets with NaiveMix (direct mixup between local data and averaged data) and FedMix, then compare the results with FedAvg (McMahan et al., 2017) and FedProx (Li et al., 2020b), as well as other baseline Mixup scenarios. We create a highly non-iid environment to show our methods excel in such situations. (A minimal sketch of the NaiveMix mixing step appears below the table.) |
| Researcher Affiliation | Academia | Tehrim Yoon, Sumin Shin, Sung Ju Hwang, Eunho Yang; Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea; {tryoon93,sym807,sjhwang82,eunhoy}@kaist.ac.kr |
| Pseudocode | Yes | Algorithm 1: Mean Augmented Federated Learning (MAFL); Algorithm 2: FedMix LocalUpdate(k, wt; Xg, Yg) under MAFL (Algorithm 1); Algorithm 3: FedAvg |
| Open Source Code | No | The paper does not provide any explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | We used three popular image classification benchmark datasets: FEMNIST (Caldas et al., 2019), CIFAR10, and CIFAR100, as well as a popular natural language processing benchmark dataset, Shakespeare. |
| Dataset Splits | Yes | CIFAR10 and CIFAR100 are very popular and simple image classification datasets for the federated setting. Both contain 50,000 training data and 10,000 test data. ... No validation data was split and we used all training data for local training. |
| Hardware Specification | No | The paper mentions GPU memory allocation (e.g., 'FedAvg requires 46.00MB to allocate, LocalMix requires 94.00MB and 98.00MB for FedMix') but does not specify any particular GPU models, CPU models, or detailed hardware specifications used for the experiments. |
| Software Dependencies | No | The paper mentions using the SGD optimizer and various network architectures (LeNet-5, VGG, LSTM) but does not provide specific version numbers for any software dependencies such as programming languages, libraries, or frameworks. |
| Experiment Setup | Yes | Table 8: Hyperparameter settings for each dataset. This table includes specific values for local epochs (E), local batch size, classes per client, fraction of clients (K/N), λ (for NaiveMix), λ (for FedMix), and µ (for FedProx). (An illustrative config record follows below.) |
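
The NaiveMix baseline quoted in the Research Type row mixes each local batch directly with server-shared averaged data. Below is a minimal PyTorch-style sketch, not the authors' code: it assumes `x_mean`/`y_mean` are batches drawn from the broadcast per-client data means (so `y_mean` is a soft label distribution), places the mixup weight λ on the averaged data, and uses illustrative names throughout.

```python
import torch.nn.functional as F

def soft_cross_entropy(logits, target_probs):
    # Cross-entropy against a soft label distribution; averaged labels
    # are mixtures of one-hot vectors, not integer class indices.
    return -(target_probs * logits.log_softmax(dim=-1)).sum(dim=-1).mean()

def naivemix_loss(model, x_local, y_local, x_mean, y_mean, lam):
    """One NaiveMix training objective: direct mixup between a local batch
    (x_local, y_local) and server-shared averaged data (x_mean, y_mean).

    Assumptions (ours, not the paper's): x_mean/y_mean are already sampled
    to match the local batch shape; y_local holds integer class indices.
    """
    # Blend inputs with weight lam on the averaged data.
    x_mixed = (1.0 - lam) * x_local + lam * x_mean
    logits = model(x_mixed)
    # Blend the two label sources with the same weights, as in standard mixup.
    return (1.0 - lam) * F.cross_entropy(logits, y_local) \
        + lam * soft_cross_entropy(logits, y_mean)
```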
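
The Table 8 fields quoted in the Experiment Setup row map naturally onto a per-dataset config record. The values below are placeholders chosen for illustration, not the paper's numbers; the actual per-dataset settings are given in Table 8 of the paper.

```python
# Hypothetical values; field names mirror the columns of the paper's Table 8.
cifar10_config = {
    "local_epochs_E": 1,      # E: local epochs per communication round
    "local_batch_size": 10,   # minibatch size for local SGD
    "classes_per_client": 2,  # controls how non-iid the client split is
    "client_fraction": 0.1,   # K/N: fraction of clients sampled per round
    "lambda_naivemix": 0.1,   # mixup weight λ for the NaiveMix baseline
    "lambda_fedmix": 0.1,     # mixup weight λ for FedMix
    "mu_fedprox": 0.01,       # proximal coefficient µ for the FedProx baseline
}
```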