Recasting Gradient-Based Meta-Learning as Hierarchical Bayes

Authors: Erin Grant, Chelsea Finn, Sergey Levine, Trevor Darrell, Thomas Griffiths

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 EXPERIMENTAL EVALUATION: The goal of our experiments is to evaluate if we can use our probabilistic interpretation of MAML to generate samples from the distribution over adapted parameters, and furthermore, if our method can be applied to large-scale meta-learning problems such as miniImageNet. Table 1: One-shot classification performance on the miniImageNet test set, with comparison methods ordered by one-shot performance. All results are averaged over 600 test episodes, and we report 95% confidence intervals.
Researcher Affiliation | Academia | 1 Berkeley AI Research (BAIR), University of California, Berkeley; 2 Department of Electrical Engineering & Computer Sciences, University of California, Berkeley; 3 Department of Psychology, University of California, Berkeley
Pseudocode | Yes | Algorithm 2: Model-agnostic meta-learning as hierarchical Bayesian inference. Subroutine 3: Subroutine for computing a point estimate φ̂ using truncated gradient descent to approximate the marginal negative log likelihood (NLL). Subroutine 4: Subroutine for computing a Laplace approximation of the marginal likelihood. (See the adaptation sketch after the table.)
Open Source Code | No | The paper does not provide any explicit statement or link for open-source code for the described methodology.
Open Datasets | Yes | We evaluate LLAMA on the miniImageNet (Ravi & Larochelle, 2017) 1-shot, 5-way classification task, a standard benchmark in few-shot classification.
Dataset Splits | Yes | miniImageNet comprises 64 training classes, 12 validation classes, and 24 test classes. During training and for each task, 10 input datapoints are sampled uniformly from [-10.0, 10.0] and the loss is the mean squared error between the prediction and the true value. (A sketch of this sinusoid regression setup appears after the table.)
Hardware Specification | Yes | In particular, our TensorFlow implementation of LLAMA trains for 60,000 iterations on one TITAN Xp GPU in 9 hours, compared to 5 hours to train MAML.
Software Dependencies | No | The paper mentions a "TensorFlow implementation" but does not specify its version number or any other software dependencies with version numbers.
Experiment Setup | Yes | We use Adam (Kingma & Ba, 2014) as the meta-optimizer, and standard batch gradient descent with a fixed learning rate to update the model during fast adaptation. LLAMA requires the prior precision term τ as well as an additional parameter η ∈ ℝ+ that weights the regularization term log det Ĥ contributed by the Laplace approximation. We fix τ = 0.001 and select η = 10^-6 via cross-validation; all other parameters are set to the values reported in Finn et al. (2017). (A sketch of this training setup appears after the table.)
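
Adaptation sketch. The subroutines named in the Pseudocode row can be illustrated with a minimal sketch. This is not the authors' code: it assumes a toy linear-regression task so the curvature can be written in closed form, and the names (adapt_point_estimate, laplace_nll), learning rate, and step count are illustrative assumptions.

# Minimal sketch of Subroutines 3-4 on a toy linear-regression task (NumPy).
# All names and hyperparameter values here are illustrative, not from the paper.
import numpy as np

def task_nll(phi, X, y):
    # Squared-error negative log likelihood (up to constants) for a linear model.
    resid = X @ phi - y
    return 0.5 * np.sum(resid ** 2)

def task_grad(phi, X, y):
    return X.T @ (X @ phi - y)

def adapt_point_estimate(theta, X, y, lr=0.01, n_steps=5):
    # Subroutine 3 (sketch): truncated gradient descent from the meta-parameters
    # theta to a task-specific point estimate phi_hat.
    phi = theta.copy()
    for _ in range(n_steps):
        phi -= lr * task_grad(phi, X, y)
    return phi

def laplace_nll(theta, X, y, tau=1e-3, eta=1e-6, lr=0.01, n_steps=5):
    # Subroutine 4 (sketch): Laplace approximation of the marginal NLL, combining
    # the NLL at phi_hat, a Gaussian prior with precision tau centred at theta,
    # and an eta-weighted log-determinant of the curvature.
    phi_hat = adapt_point_estimate(theta, X, y, lr, n_steps)
    H = X.T @ X + tau * np.eye(len(theta))   # curvature of the NLL plus the prior
    sign, logdet = np.linalg.slogdet(H)
    return (task_nll(phi_hat, X, y)
            + 0.5 * tau * np.sum((phi_hat - theta) ** 2)
            + eta * logdet)

# Usage: evaluate the approximate marginal NLL for one synthetic task.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(10, 3)), rng.normal(size=10)
theta = np.zeros(3)
print(laplace_nll(theta, X, y))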
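
Sinusoid sketch. A minimal sketch of the sinusoid regression setup quoted in the Dataset Splits row. The input range [-10.0, 10.0], the 10 datapoints per task, and the MSE loss follow the quoted text; the amplitude and phase ranges are assumptions added for illustration.

# Sketch of per-task data generation for the sinusoid regression setup.
import numpy as np

def sample_sinusoid_task(rng, n_points=10):
    amplitude = rng.uniform(0.1, 5.0)          # assumed range, for illustration
    phase = rng.uniform(0.0, np.pi)            # assumed range, for illustration
    x = rng.uniform(-10.0, 10.0, size=n_points)  # 10 inputs from [-10.0, 10.0]
    y = amplitude * np.sin(x + phase)
    return x, y

def mse_loss(pred, target):
    return np.mean((pred - target) ** 2)

rng = np.random.default_rng(0)
x, y = sample_sinusoid_task(rng)
print(mse_loss(np.zeros_like(y), y))           # loss of a trivial zero predictor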
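
Training-setup sketch. A minimal sketch of the setup quoted in the Experiment Setup row, assuming PyTorch and a toy linear model: Adam as the meta-optimizer, fixed-learning-rate gradient descent for fast adaptation, τ = 0.001, and a small η weighting the Laplace log-determinant term. The learning rates, step counts, and task sampler are assumptions, not values from the paper.

# Hedged sketch (not the authors' implementation) of meta-training with Adam on a
# Laplace-approximated objective, using a toy linear model so the curvature is X^T X.
import torch

tau, eta = 1e-3, 1e-6                   # values quoted in the row above
inner_lr, n_inner_steps = 0.01, 5       # assumed inner-loop settings

theta = torch.zeros(3, requires_grad=True)        # meta-parameters
meta_opt = torch.optim.Adam([theta], lr=1e-3)     # Adam meta-optimizer

def task_nll(phi, X, y):
    return 0.5 * torch.sum((X @ phi - y) ** 2)

def sample_task():
    # Toy linear-regression task; stands in for the paper's task distribution.
    X = torch.randn(10, 3)
    w = torch.randn(3)
    return X, X @ w + 0.1 * torch.randn(10)

for step in range(100):
    X, y = sample_task()
    # Fast adaptation: truncated gradient descent from theta, kept differentiable
    # so meta-gradients flow back through the inner updates.
    phi = theta
    for _ in range(n_inner_steps):
        grad = torch.autograd.grad(task_nll(phi, X, y), phi, create_graph=True)[0]
        phi = phi - inner_lr * grad
    # Laplace-approximated marginal NLL: data term + Gaussian prior with precision
    # tau + eta-weighted log-determinant of the curvature.
    H = X.T @ X + tau * torch.eye(3)
    meta_loss = (task_nll(phi, X, y)
                 + 0.5 * tau * torch.sum((phi - theta) ** 2)
                 + eta * torch.logdet(H))
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()

Because the toy model is linear, the curvature X^T X does not depend on the parameters, so the log-det term is constant here; in the paper the curvature is estimated from the task NLL at the adapted point estimate φ̂.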