Meta-learning with negative learning rates
Authors: Alberto Bernacchia
ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the theory by running extensive experiments. |
| Researcher Affiliation | Industry | Alberto Bernacchia, MediaTek Research, alberto.bernacchia@mtkresearch.com |
| Pseudocode | No | No structured pseudocode or algorithm blocks are explicitly presented. |
| Open Source Code | No | The paper does not provide an unambiguous statement or link for the release of open-source code for the described methodology. |
| Open Datasets | No | The paper describes a generative model for synthetic data (e.g., x ~ N(0, I_p), y = x^T w + z) and a quadratic function for non-linear regression, but does not provide access information for a publicly available or open dataset (a data-generation sketch follows the table). |
| Dataset Splits | Yes | The training set D_t^(i) = {x_j^t(i), y_j^t(i)}_{j=1:n_t} and validation set D_v^(i) = {x_j^v(i), y_j^v(i)}_{j=1:n_v} are drawn independently from the same distribution in each task i. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like MAML, neural networks, and stochastic gradient descent but does not provide specific version numbers for any software or libraries. |
| Experiment Setup | Yes | We report results with a network width of 400 in both layers; results were similar with larger network widths. We use the square loss function and we train the neural network in the outer loop with stochastic gradient descent with a learning rate of 0.001 for 5000 epochs (until convergence). We used most parameters identical to Section 5.1: n_t = 30, n_v = 2, n_r = 20, m = 3, p = 60, σ = 1, ν = 0.5, w_0 = 0. The learning rate for adaptation was set to α_r = 0.01 (a training-setup sketch follows the table). |
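
The generative model and per-task splits quoted in the table can be made concrete with a short sketch. The snippet below is a minimal illustration under our own reading of the setup, not the author's code: it assumes the task weight vector is drawn as w = w_0 + ν · N(0, I_p) and the observation noise as z ~ N(0, σ²), using the quoted quantities p = 60, n_t = 30, n_v = 2, σ = 1, ν = 0.5, w_0 = 0; all function and variable names are ours.

```python
import numpy as np

def sample_task(rng, p=60, n_t=30, n_v=2, w0=0.0, nu=0.5, sigma=1.0):
    """Sample one synthetic linear-regression task: x ~ N(0, I_p), y = x^T w + z.

    Assumption (not quoted from the paper): the task weight vector w is drawn
    around the common mean w0 with spread nu, and z is Gaussian noise with
    standard deviation sigma.
    """
    w = w0 + nu * rng.standard_normal(p)               # task-specific weights
    def draw(n):
        x = rng.standard_normal((n, p))                # inputs x ~ N(0, I_p)
        y = x @ w + sigma * rng.standard_normal(n)     # targets y = x^T w + z
        return x, y
    # Training and validation sets are drawn independently from the same
    # distribution within each task, as stated in the "Dataset Splits" row.
    return draw(n_t), draw(n_v)

rng = np.random.default_rng(0)
(train_x, train_y), (val_x, val_y) = sample_task(rng)
print(train_x.shape, val_x.shape)  # (30, 60) (2, 60)
```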
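The hyperparameters quoted in the "Experiment Setup" row can likewise be wired into a meta-training step. The sketch below is a first-order simplification written by us (the paper's MAML-style procedure may differentiate through the adaptation step); the network depth (two hidden layers of width 400), the reading of m = 3 as tasks per meta-batch, and all names are our assumptions.

```python
import copy
import torch
import torch.nn as nn

# Quantities quoted in the table: width 400 in both layers, square loss,
# outer-loop SGD with lr 0.001 for 5000 epochs, adaptation lr alpha_r = 0.01.
p, width = 60, 400
outer_lr, adapt_lr = 1e-3, 0.01
epochs, tasks_per_batch = 5000, 3   # m = 3 read as tasks per meta-batch (assumption)

net = nn.Sequential(
    nn.Linear(p, width), nn.ReLU(),
    nn.Linear(width, width), nn.ReLU(),
    nn.Linear(width, 1),
)
loss_fn = nn.MSELoss()                                   # square loss
outer_opt = torch.optim.SGD(net.parameters(), lr=outer_lr)

def meta_step(tasks):
    """First-order meta-update: adapt a copy of the network on each task's
    training set with one gradient step at adapt_lr, evaluate on the task's
    validation set, and apply the resulting gradients to the shared network."""
    outer_opt.zero_grad()
    for (tx, ty), (vx, vy) in tasks:
        fast = copy.deepcopy(net)
        inner_loss = loss_fn(fast(tx), ty)
        grads = torch.autograd.grad(inner_loss, list(fast.parameters()))
        with torch.no_grad():
            for param, g in zip(fast.parameters(), grads):
                param -= adapt_lr * g                    # inner adaptation step
        loss_fn(fast(vx), vy).backward()                 # grads accumulate in `fast`
        with torch.no_grad():
            for p_net, p_fast in zip(net.parameters(), fast.parameters()):
                update = p_fast.grad / len(tasks)
                p_net.grad = update if p_net.grad is None else p_net.grad + update
    outer_opt.step()

# Tiny smoke test with random tensors standing in for the sampled tasks;
# a full run would repeat this for 5000 epochs over freshly sampled tasks.
tx, ty = torch.randn(30, p), torch.randn(30, 1)
vx, vy = torch.randn(2, p), torch.randn(2, 1)
meta_step([((tx, ty), (vx, vy))] * tasks_per_batch)
```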