Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and Personalized Federated Learning
Authors: Bokun Wang, Zhuoning Yuan, Yiming Ying, Tianbao Yang
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the performance of our proposed algorithms, MOMLv1, MOMLv2 and Local MOML, on sinewave regression and one-shot classification tasks in the single-node setting. Furthermore, we demonstrate the effectiveness of Local MOML in the simulated federated learning setting for the image classification task. |
| Researcher Affiliation | Academia | Bokun Wang (EMAIL), Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77843, USA; Zhuoning Yuan (EMAIL), Department of Computer Science, The University of Iowa, Iowa City, IA 52242, USA; Yiming Ying (EMAIL), Department of Mathematics and Statistics, University at Albany, Albany, NY 12222, USA; Tianbao Yang (EMAIL), Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77843, USA |
| Pseudocode | Yes | Algorithm 1 MOMLv1, Algorithm 2 MOMLv2, Algorithm 3 Local MOML |
| Open Source Code | Yes | Interested readers can access our code at https://github.com/bokun-wang/moml. |
| Open Datasets | Yes | We evaluate the performance of our proposed algorithms, MOMLv1, MOMLv2 and Local MOML, on sinewave regression and one-shot classification tasks in the single-node setting... on the sinewave regression problem (Finn et al., 2017)... on the Omniglot and CIFAR-100 datasets... We consider three data sets, MNIST, CIFAR-10, and CIFAR-100. |
| Dataset Splits | Yes | We generate 25 tasks for training in total... The training and validation data of each task are randomly sampled in an online manner... adapted to 5 randomly sampled unseen tasks... 10 test data points... another 100 data points from the unseen task... For the Omniglot dataset, we randomly select 25 tasks for training and 10 tasks for testing... For the CIFAR-100 dataset, we randomly select 17 tasks for training and 3 tasks for testing... we distribute the training data between N = 50 clients (tasks)... Similarly, we divide the test data among the clients with the same distribution as the one for the training data. We set a = 68 for constructing the distributed training sets of MNIST, CIFAR-10, and CIFAR-100, and set a = 34 for constructing the test sets of MNIST and CIFAR-10 and a = 15 for constructing the test sets of CIFAR-100. |
| Hardware Specification | No | We conduct experiments on four GPUs to mimic the cross-device federated learning setting, where all 50 tasks are distributed to the four GPUs roughly evenly. The paper mentions using 'four GPUs' but does not specify the exact GPU models, memory, or any other hardware details. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | The inner step size α is set to 0.01 for all algorithms. The outer step size η is decayed 10 times at 75% of the total iterations and its initial value is tuned for the algorithms separately by grid search in {0.1, 0.05, 0.01, 0.005, 0.001}. We also tune β for MOMLv1, MOMLv2 and Local MOML. It turns out that β = 0.3 and β = 0.5 work reasonably well for MOMLv1 and Local MOML while β = 0.1 is good for MOMLv2. For Local MOML, we set the size of the initial number of samples K0 of each round to be 2 times K and H = 5. We use α = 0.001 and the step size for the considered algorithms is tuned in a range similar to before. For all algorithms, we consider two settings of H = 4 and H = 10. The minibatch size at every iteration (including the initial one at each round) is set to 5, that is, K = 5, K0 = 5. We tune the β in a range [0.1, 0.9], and run a total of 10000 iterations. For pFedMe, we tune its hyperparameter λ = 100 and set the number of steps to be 50 to solve the sub-problem accurately enough. |
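The reported outer step-size schedule (decay by a factor of 10 once 75% of the total iterations have elapsed, with the initial value chosen by grid search) can be sketched as follows; the function name and signature are illustrative, not taken from the paper or its code release:

```python
# Grid of initial outer step sizes searched per algorithm, per the reported setup.
ETA0_GRID = [0.1, 0.05, 0.01, 0.005, 0.001]

def outer_step_size(eta0: float, t: int, total_iters: int) -> float:
    """Outer step size at iteration t: the initial value eta0 until 75% of
    total_iters have elapsed, then decayed by a factor of 10."""
    return eta0 / 10.0 if t >= 0.75 * total_iters else eta0
```

For example, with eta0 = 0.1 and 10000 total iterations, the schedule returns 0.1 for the first 7500 iterations and 0.01 thereafter.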