Gradient-EM Bayesian Meta-Learning

Authors: Yayi Zou, Xiaoqi Lu

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on sinusoidal regression, few-shot image classification, and policy-based reinforcement learning show that our method not only achieves better accuracy with less computation cost, but is also more robust to uncertainty."
Researcher Affiliation | Collaboration | Yayi Zou, Didi AI Labs @ Silicon Valley (yz725@cornell.edu); Xiaoqi Lu, Columbia University (lx2170@columbia.edu)
Pseudocode | Yes | Algorithm 1: Extended Empirical Bayes Meta-learning Framework.
Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets | Yes | "Experiments on sinusoidal regression, few-shot image classification, and policy-based reinforcement learning... Omniglot dataset and MiniImagenet dataset which are popular few-shot learning benchmarks... We test and compare the models on the same 2D Navigation and MuJoCo continuous control tasks as are used in [3]."
Dataset Splits | Yes | "The objective is to come up with an algorithm that produces a decision rule f_τ that minimizes the expected test loss over all tasks E_{τ∼P(τ)} l_τ. ... D_τ^tr is firstly provided. We are then required to return f_τ based on {D_τ^tr ∪ D_τ^val : τ ∈ T_meta-train} ∪ D_τ^tr and evaluate its expected loss l_τ = E_{D_τ∼τ} L(D_τ, f_τ) on more samples generated from that task. ... For each task τ of this set, we collect K sample rollouts of the current policy f, denoted D_τ^tr, and another K sample rollouts after one policy-gradient update of f, denoted D_τ^val (f is not needed in generating samples in supervised learning)."
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions optimizers like Adam and TRPO, but does not provide specific version numbers for any software dependencies or libraries used.
Experiment Setup | Yes | "Data of each task is generated from y = A sin(wx + b) + ϵ with amplitude A, frequency w, and phase b as task parameters and observation noise ϵ. Task parameters are sampled from uniform distributions A ∈ [0.1, 5.0], b ∈ [0.0, 2π], w ∈ [0.5, 2.0], and observation noise follows ϵ ∼ N(0, (0.01A)²). x ranges over [−5.0, 5.0]. For each task, K = 10 observations ({x_i, y_i} pairs) are given. The underlying network architecture (2 hidden layers of size 40 with ReLU activation) is the same as [3] to make a fair comparison."
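The sinusoidal task distribution quoted above can be sketched in a few lines of NumPy. This is a minimal illustration of the described setup, not the authors' code; the function name `sample_sinusoid_task` is our own.

```python
import numpy as np

def sample_sinusoid_task(rng, K=10):
    """Sample one sinusoidal regression task y = A*sin(w*x + b) + eps.

    Ranges follow the quoted setup: A ~ U[0.1, 5.0], b ~ U[0, 2*pi],
    w ~ U[0.5, 2.0], eps ~ N(0, (0.01*A)^2), x drawn from [-5.0, 5.0].
    """
    A = rng.uniform(0.1, 5.0)        # amplitude (task parameter)
    b = rng.uniform(0.0, 2 * np.pi)  # phase (task parameter)
    w = rng.uniform(0.5, 2.0)        # frequency (task parameter)
    x = rng.uniform(-5.0, 5.0, size=K)
    eps = rng.normal(0.0, 0.01 * A, size=K)  # observation noise
    y = A * np.sin(w * x + b) + eps
    return x, y

rng = np.random.default_rng(0)
x, y = sample_sinusoid_task(rng)  # K = 10 (x_i, y_i) pairs for one task
```

Each call yields the K = 10 observations of a fresh task; a meta-learner would treat each such (x, y) set as one episode.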