Adaptive Gradient-Based Meta-Learning Methods
Authors: Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 6, "Empirical Results: Adaptive Methods for Few-Shot & Federated Learning" |
| Researcher Affiliation | Collaboration | Mikhail Khodak, Carnegie Mellon University, khodak@cmu.edu; Maria-Florina Balcan, Carnegie Mellon University, ninamf@cs.cmu.edu; Ameet Talwalkar, Carnegie Mellon University & Determined AI, talwalkar@cmu.edu |
| Pseudocode | Yes | Algorithm 1, "Generic online algorithm for gradient-based parameter-transfer meta-learning," and Algorithm 2, "ARUBA: an approach for modifying a generic batch GBML method to learn a per-coordinate learning rate" (see the sketch below the table) |
| Open Source Code | No | The paper does not provide any explicit statement about releasing code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | 20-way Omniglot [37], 5-way Mini-ImageNet [46], and the Shakespeare dataset [12]. These are standard, publicly available datasets with proper citations. |
| Dataset Splits | No | The paper mentions using standard benchmarks like Omniglot and Mini-ImageNet, but it does not explicitly specify train/validation/test split percentages, sample counts, or a specific publication detailing the splits used in these experiments. |
| Hardware Specification | No | The paper does not specify any particular CPU or GPU models, or other hardware components used for running the experiments. |
| Software Dependencies | No | The paper mentions algorithms like Adam [36] and FedAvg [41] but does not list any specific software dependencies or libraries with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions) required for reproduction. |
| Experiment Setup | Yes | Algorithm 2 (ARUBA): an approach for modifying a generic batch GBML method to learn a per-coordinate learning rate. Input: T tasks, update method for meta-initialization, within-task descent method, settings ε, ζ, p > 0. Initialize b₁ ← ε²1_d, g₁ ← ζ²1_d. |
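
To make the pseudocode entries above concrete, here is a minimal NumPy sketch of an ARUBA-style per-coordinate learning-rate rule, not the authors' code. It assumes a simplified rule η = √(b/g), where b accumulates squared per-coordinate displacement across tasks and g accumulates squared per-coordinate gradients (initialized from ε and ζ as in Algorithm 2); the paper's p-dependent correction terms are omitted, the meta-initialization update is a Reptile-style interpolation chosen for illustration, and the names `aruba_sketch`, `inner_steps`, and `meta_lr` are hypothetical.

```python
import numpy as np

def aruba_sketch(tasks, d, epsilon=0.1, zeta=0.1, inner_steps=5, meta_lr=0.05):
    """Simplified ARUBA-style loop: each task is a gradient oracle grad(theta)."""
    phi = np.zeros(d)                  # meta-initialization
    b = (epsilon ** 2) * np.ones(d)    # squared per-coordinate displacement, b1 = eps^2 * 1_d
    g = (zeta ** 2) * np.ones(d)       # squared per-coordinate gradients, g1 = zeta^2 * 1_d
    for grad in tasks:
        eta = np.sqrt(b / g)           # per-coordinate learning rate for this task
        theta = phi.copy()
        sq_grads = np.zeros(d)
        for _ in range(inner_steps):   # within-task descent with rate eta
            grad_t = grad(theta)
            sq_grads += grad_t ** 2
            theta -= eta * grad_t
        g += sq_grads                  # accumulate this task's squared gradients
        b += (theta - phi) ** 2        # how far this task's solution moved per coordinate
        phi += meta_lr * (theta - phi) # Reptile-style meta-initialization update (illustrative)
    return phi, np.sqrt(b / g)

# Toy usage: 50 quadratic tasks 0.5 * ||theta - c||^2 with random optima c.
rng = np.random.default_rng(0)
d = 10
optima = rng.normal(size=(50, d))
tasks = [lambda th, c=c: th - c for c in optima]
phi, eta = aruba_sketch(tasks, d)
print(phi.round(2), eta.round(2))
```

On these toy quadratics, coordinates whose task optima vary widely accumulate larger b and therefore keep larger step sizes, which reflects the intuition described in the paper: coordinates that vary more across tasks should be updated more aggressively within each task.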