Gradient-EM Bayesian Meta-Learning
Authors: Yayi Zou, Xiaoqi Lu
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on sinusoidal regression, few-shot image classification, and policy-based reinforcement learning show that our method not only achieves better accuracy with less computation cost, but is also more robust to uncertainty. |
| Researcher Affiliation | Collaboration | Yayi Zou Didi AI Labs @Silicon Valley yz725@cornell.edu Xiaoqi Lu Columbia University lx2170@columbia.edu |
| Pseudocode | Yes | Algorithm 1: Extended Empirical Bayes Meta-learning Framework. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | Experiments on sinusoidal regression, few-shot image classification, and policy-based reinforcement learning... Omniglot dataset and Mini-ImageNet dataset which are popular few-shot learning benchmarks... We test and compare the models on the same 2D Navigation and MuJoCo continuous control tasks as are used in [3]. |
| Dataset Splits | Yes | The objective is to come up with an algorithm that produces a decision rule f_τ that minimizes the expected test loss over all tasks E_{τ∼P(τ)}[l_τ]. ... D_τ^tr is firstly provided. We are then required to return f_τ based on {D_τ^tr ∪ D_τ^val : τ ∈ T_meta-train} and D_τ^tr, and evaluate its expected loss l_τ = E_{D_τ∼τ} L(D_τ, f_τ) on more samples generated from that task. ... For each task τ of this set, we collect K sample rollouts of the current policy f denoted as D_τ^tr and another K sample rollouts after 1 policy gradient training step of f denoted as D_τ^val (f is not needed in generating samples in supervised learning). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions optimizers like Adam and TRPO, but does not provide specific version numbers for any software dependencies or libraries used. |
| Experiment Setup | Yes | Data of each task is generated from y = A sin(wx + b) + ε with amplitude A, frequency w, and phase b as task parameters and observation noise ε. Task parameters are sampled from uniform distributions A ∈ [0.1, 5.0], b ∈ [0.0, 2π], w ∈ [0.5, 2.0] and observation noise follows ε ∼ N(0, (0.01A)²). x ranges over [−5.0, 5.0]. For each task, K = 10 observations ({x_i, y_i} pairs) are given. The underlying network architecture (2 hidden layers of size 40 with ReLU activation) is the same as [3] to make a fair comparison. |
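The sinusoidal task generation quoted in the Experiment Setup row can be sketched in a few lines of numpy. This is a minimal reconstruction from the paper's stated distributions, not the authors' released code (the review above notes no code was released); the function name and seeding are illustrative choices.

```python
import numpy as np

def sample_sinusoid_task(rng, K=10):
    """Sample one sinusoidal regression task following the paper's setup:
    y = A*sin(w*x + b) + eps, with task parameters A, w, b drawn uniformly
    and observation noise eps ~ N(0, (0.01*A)^2)."""
    A = rng.uniform(0.1, 5.0)         # amplitude
    b = rng.uniform(0.0, 2 * np.pi)   # phase
    w = rng.uniform(0.5, 2.0)         # frequency
    x = rng.uniform(-5.0, 5.0, size=K)            # inputs over [-5, 5]
    eps = rng.normal(0.0, 0.01 * A, size=K)       # noise scales with A
    y = A * np.sin(w * x + b) + eps
    return x, y, (A, w, b)

rng = np.random.default_rng(0)
x, y, params = sample_sinusoid_task(rng)
print(x.shape, y.shape)  # (10,) (10,)
```

Each call yields the K = 10 observation pairs {x_i, y_i} that constitute one task's training data in the meta-learning setup.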