Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Aequa: Fair Model Rewards in Collaborative Learning via Slimmable Networks
Authors: Nurbek Tastan, Samuel Horváth, Karthik Nandakumar
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We theoretically study the convergence of our proposed approach and empirically validate it using extensive experiments on different datasets and architectures. We also extend our approach to enable training-time model reward allocation. The code can be found at https://github.com/tnurbek/aequa. |
| Researcher Affiliation | Academia | 1Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, UAE 2Michigan State University (MSU), Michigan, USA. Correspondence to: Nurbek Tastan <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Aequa: Federated optimization Algorithm 2 Aequa (with training-time model rewards) |
| Open Source Code | Yes | The code can be found at https://github.com/tnurbek/aequa. |
| Open Datasets | Yes | We use the following datasets to carry out our experiments (following (Li et al., 2020)): MNIST (Le Cun, 1998), Fashion-MNIST (FMNIST) (Xiao et al., 2017), SVHN (Netzer et al., 2011), CIFAR-10 & CIFAR-100 (Krizhevsky et al., 2009), Stanford Sentiment Treebank (SST) (Socher et al., 2013), and the federated handwriting dataset FEMNIST (Caldas et al., 2019). |
| Dataset Splits | Yes | The other datasets are partitioned using the following strategies: (i) homogeneous, where each participant gets an equal number of data points per class; (ii) heterogeneous, where each client gets a varying number of data points per class based on a Dirichlet(α) distribution (the concentration parameter α reflects the degree of non-i.i.d. characteristics within the dataset); (iii) quantity skew, which allocates a κ proportion of the total data points to each of the m selected participants, while the remaining N − m participants split the remaining data equally; (iv) label skew, denoted by #C = m, which creates a label imbalance by sampling m classes for each client and then randomly distributing samples from class m among the selected participants. |
| Hardware Specification | Yes | All experiments were carried out on NVIDIA A100-SXM4-40GB GPUs, with each run utilizing a single GPU. |
| Software Dependencies | No | The paper mentions "SGD with momentum" and "learning rate scheduler" as part of the implementation details, but does not specify software dependencies like specific library names with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | We use cross-entropy loss for all image and language classification tasks and maintain consistent training hyperparameters across all experiments. The optimizer of choice is SGD with momentum, with a default initial learning rate of 0.01. A learning rate scheduler is applied, reducing the learning rate by a factor of 0.1 at rounds 50 and 75 when the total number of communication rounds is set to 100. The total number of communication rounds is set as follows: T = 100 for CIFAR-10, CIFAR-100, and SST; T = 50 for MNIST, FMNIST, and SVHN. In each round, clients perform one local epoch of training. The batch size is fixed at 128 across all experiments. |
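The heterogeneous split described in the Dataset Splits row (strategy (ii)) is the common Dirichlet-based partition from Li et al. (2020). A minimal sketch of that strategy is shown below; the function name and seeding are illustrative and not taken from the authors' repository, which should be consulted for the exact partitioning code.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Heterogeneous split: per-class client proportions drawn from Dirichlet(alpha).

    Smaller alpha -> more skewed (non-i.i.d.) allocations; larger alpha -> closer
    to a uniform per-class split across clients.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        # Shuffle the indices of class c, then carve them up by sampled proportions.
        idx = rng.permutation(np.where(labels == c)[0])
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices
```

Every data point is assigned to exactly one client; only the per-client class balance varies with α.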
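The Experiment Setup row fully specifies the learning rate schedule: start at 0.01 and multiply by 0.1 at rounds 50 and 75. As the Software Dependencies row notes, the paper does not name a framework, so the following is a framework-agnostic sketch of that step decay (equivalent to a multi-step scheduler such as PyTorch's `MultiStepLR`); the helper name is illustrative.

```python
def learning_rate(round_idx, base_lr=0.01, milestones=(50, 75), gamma=0.1):
    """Step decay: multiply base_lr by gamma at each milestone round reached."""
    lr = base_lr
    for m in milestones:
        if round_idx >= m:
            lr *= gamma
    return lr
```

For the T = 100 runs this yields 0.01 for rounds 0-49, 0.001 for rounds 50-74, and 0.0001 thereafter; for the T = 50 runs, training ends before the first milestone.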