ES-MAML: Simple Hessian-Free Meta Learning
Authors: Xingyou Song, Wenbo Gao, Yuxiang Yang, Krzysztof Choromanski, Aldo Pacchiano, Yunhao Tang
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 4, we present numerical experiments, highlighting the topics of exploration (Section 4.1), the utility of compact architectures (Section 4.2), the stability of deterministic policies (Section 4.3), and comparisons against existing MAML algorithms in the few-shot regime (Section 4.4). We show empirically that ES-MAML is competitive with existing methods and often yields better adaptation with fewer queries. |
| Researcher Affiliation | Collaboration | Xingyou Song, Yuxiang Yang, Krzysztof Choromanski (Google Brain; {xingyousong,yxyang,kchoro}@google.com); Aldo Pacchiano (UC Berkeley; pacchiano@berkeley.edu); Wenbo Gao, Yunhao Tang (Columbia University; {wg2279,yt2541}@columbia.edu) |
| Pseudocode | Yes | Algorithm 1: Monte Carlo ES Gradient |
| Open Source Code | No | The paper does not provide any explicit statements about open-sourcing code or links to a code repository for the described methodology. |
| Open Datasets | Yes | Navigation-2D (Finn et al., 2017) is a classic environment where the agent must explore to adapt to the task. |
| Dataset Splits | No | The paper mentions 'Train Set Size' and 'Test Set Size' in the hyperparameter table, but does not provide specific train/validation/test dataset splits (e.g., percentages, sample counts, or explicit splitting methodology) for reproduction. |
| Hardware Specification | No | The paper mentions that ES-MAML 'can be parallelized over CPUs' but does not provide specific hardware details like CPU models, GPU types, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper mentions 'Tensorflow' but does not provide specific version numbers for it or any other software dependencies. |
| Experiment Setup | Yes | For standard ES-MAML (Algorithm 3), we used the following settings (Setting: Value). (Total Workers, # Perturbations, # Current Evals): (300, 150, 150); (Train Set Size, Task Batch Size, Test Set Size): (50, 5, 5) or (N, N, N); Number of rollouts per parameter: 1; Number of Perturbations per worker: 1; Outer-Loop Precision Parameter: 0.1; Adaptation Precision Parameter: 0.1; Outer-Loop Step Size: 0.01; Adaptation Step Size (α): 0.05; Hidden Layer Width: 32; ES Estimation Type: Forward-FD; Reward Normalization: True; State Normalization: True |
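
Since no code was released, the following is a minimal sketch of what a forward finite-difference ("Forward-FD") Monte Carlo ES gradient estimate could look like. It is not the authors' implementation. The function name `es_gradient_forward_fd`, its signature, and the toy objective are hypothetical. Reading the "precision parameter" as the Gaussian smoothing scale `sigma` is also an assumption, as is reusing the values 0.1, 150, and 0.01 from the settings above as defaults.

```python
import numpy as np


def es_gradient_forward_fd(f, theta, sigma=0.1, n_perturbations=150, rng=None):
    """Forward-FD Monte Carlo estimate of the gradient of the Gaussian
    smoothing of f at theta.

    Hypothetical helper: sigma=0.1 and n_perturbations=150 only mirror the
    reported settings; mapping them onto these arguments is an assumption.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta, dtype=float)
    base = f(theta)  # single evaluation at the unperturbed parameters
    # One standard Gaussian direction per perturbation.
    eps = rng.standard_normal((n_perturbations, theta.size))
    # Forward differences f(theta + sigma * eps_i) - f(theta).
    deltas = np.array([f(theta + sigma * e) - base for e in eps])
    # Average the difference-weighted directions and rescale by 1 / sigma.
    return (deltas[:, None] * eps).mean(axis=0) / sigma


def toy_reward(x):
    """Toy objective (not from the paper): negative squared distance to (1, 2)."""
    return -np.sum((x - np.array([1.0, 2.0])) ** 2)


# One gradient-ascent step with step size 0.01, starting from the origin.
theta = np.zeros(2)
grad = es_gradient_forward_fd(toy_reward, theta, rng=np.random.default_rng(0))
theta = theta + 0.01 * grad
```

The estimator only queries `f` at perturbed points, so the evaluations are independent and could in principle be spread over CPU workers, in line with the parallelization remark quoted above. The sketch evaluates them sequentially for simplicity.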