Meta-learning with negative learning rates

Authors: Alberto Bernacchia

ICLR 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the theory by running extensive experiments. |
| Researcher Affiliation | Industry | Alberto Bernacchia, MediaTek Research, alberto.bernacchia@mtkresearch.com |
| Pseudocode | No | No structured pseudocode or algorithm blocks are explicitly presented. |
| Open Source Code | No | The paper does not provide an unambiguous statement or link for the release of open-source code for the described methodology. |
| Open Datasets | No | The paper describes a generative model for synthetic data (inputs 'x ~ N(0, I_p)', targets 'y = x^T w + z') and a quadratic function for non-linear regression, but does not provide access information for a publicly available or open dataset (see the data-generation sketch after the table). |
| Dataset Splits | Yes | The training set D_t^(i) = {(x_j^t(i), y_j^t(i))}_{j=1:n_t} and the validation set D_v^(i) = {(x_j^v(i), y_j^v(i))} are drawn independently from the same distribution in each task i. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like MAML, neural networks, and stochastic gradient descent but does not provide specific version numbers for any software or libraries. |
| Experiment Setup | Yes | We report results with a network width of 400 in both layers; results were similar with larger network widths. We use the square loss function and we train the neural network in the outer loop with stochastic gradient descent with a learning rate of 0.001 for 5000 epochs (until convergence). We used most parameters identical to section 5.1: n_t = 30; n_v = 2; n_r = 20; m = 3; p = 60; σ = 1, ν = 0.5, w_0 = 0. The learning rate for adaptation was set to α_r = 0.01 (see the training-loop sketch after the table). |
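
The "Open Datasets" and "Dataset Splits" rows describe a purely synthetic generative model: x ~ N(0, I_p), y = x^T w + z, with a training set of n_t points and a validation set of n_v points per task. The following is a minimal sketch of how such tasks could be generated, assuming task weights are drawn around w_0 with spread ν; the function name sample_task and the task-weight sampling are illustrative assumptions, not the authors' code.

```python
import numpy as np

# Hypothetical sketch of the synthetic regression tasks quoted above:
# inputs x ~ N(0, I_p), targets y = x^T w + z, with a training set (n_t points)
# and a validation set (n_v points) drawn from the same distribution per task.
def sample_task(p=60, n_t=30, n_v=2, sigma=1.0, nu=0.5, w0=0.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    # Task-specific weight vector drawn around w0 with spread nu (assumed sampling).
    w = w0 + nu * rng.standard_normal(p)

    def draw(n):
        x = rng.standard_normal((n, p))      # x ~ N(0, I_p)
        z = sigma * rng.standard_normal(n)   # Gaussian output noise
        y = x @ w + z                        # y = x^T w + z
        return x, y

    return draw(n_t), draw(n_v)              # (D_t^(i), D_v^(i)) for one task
```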
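The "Experiment Setup" row reports a network of width 400 in both layers, the square loss, outer-loop SGD with learning rate 0.001 for 5000 epochs, and an adaptation learning rate α_r = 0.01. Below is a minimal PyTorch sketch of a MAML-style inner/outer loop under that configuration; the single-step adaptation, the exact architecture, and the per-epoch task sampling are assumptions (the paper releases no code), and it reuses the hypothetical sample_task() from the previous sketch.

```python
import torch
import torch.nn as nn

p, width = 60, 400
alpha_r = 0.01                                   # adaptation (inner-loop) learning rate

# Two hidden layers of width 400 (assumed reading of "width of 400 in both layers").
net = nn.Sequential(nn.Linear(p, width), nn.ReLU(),
                    nn.Linear(width, width), nn.ReLU(),
                    nn.Linear(width, 1))
loss_fn = nn.MSELoss()                           # square loss
outer_opt = torch.optim.SGD(net.parameters(), lr=1e-3)

def adapted_prediction(x_t, y_t, x_v):
    # One MAML-style gradient step on the task's training set, then predict on
    # the validation inputs; the single inner step is an assumption.
    names, params = zip(*net.named_parameters())
    pred_t = torch.func.functional_call(net, dict(zip(names, params)), (x_t,))
    inner_loss = loss_fn(pred_t.squeeze(-1), y_t)
    grads = torch.autograd.grad(inner_loss, params, create_graph=True)
    adapted = {n: w - alpha_r * g for n, w, g in zip(names, params, grads)}
    return torch.func.functional_call(net, adapted, (x_v,))

# Outer loop: SGD on the post-adaptation validation loss for 5000 epochs,
# drawing one task per step via the hypothetical sample_task() above.
for epoch in range(5000):
    (x_t, y_t), (x_v, y_v) = sample_task()
    x_t, y_t = (torch.as_tensor(a, dtype=torch.float32) for a in (x_t, y_t))
    x_v, y_v = (torch.as_tensor(a, dtype=torch.float32) for a in (x_v, y_v))
    outer_loss = loss_fn(adapted_prediction(x_t, y_t, x_v).squeeze(-1), y_v)
    outer_opt.zero_grad()
    outer_loss.backward()
    outer_opt.step()
```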