Theoretical bounds on estimation error for meta-learning
Authors: James Lucas, Mengye Ren, Irene Raissa KAMENI KAMENI, Toniann Pitassi, Richard Zemel
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our primary contributions can be summarized as follows: (1) we introduce novel lower bounds on the minimax risk of parameter estimation in meta-learning; (2) through these bounds, we compare the relative utility of samples from meta-training tasks and the novel task, and emphasize the importance of the relationship between the tasks; (3) we provide novel upper bounds on the error rate for estimation in a hierarchical meta-linear-regression problem, which we verify through an empirical evaluation. |
| Researcher Affiliation | Academia | The extracted text provides no clear institutional affiliations (university names, company names, or email domains) from which the authors' affiliation type could be classified. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code to reproduce these plots is provided in the supplementary materials with our submission. |
| Open Datasets | No | The paper describes generating synthetic data for 'polynomial regression over inputs in the range [-1, 1]' and 'sinusoid functions by placing a prior over the amplitude and phase'. It does not provide concrete access information (link, DOI, or formal citation) for a publicly available dataset; it instead describes a data-generation process (sketched in code after this table). |
| Dataset Splits | No | The paper specifies per-task sample counts: n and nq datapoints at training tasks (support and query sets) and k and kq datapoints at testing tasks (support and query sets). While this defines how much data the meta-learner sees per task, the paper does not specify explicit train/validation/test splits (e.g., percentages or fixed sample counts) for a static dataset, since all data is generated synthetically. |
| Hardware Specification | No | This experiment therefore lasted 20 hours in total. M = 50, n ∈ {20, 200}, k ∈ {100, 1000}, σ ∈ [10⁻⁸, 1.5], Mq = 100, eps per batch = 25, train ampl range = [1, 4], train phase range = [0, π/2], val ampl range = [3, 5], val phase range = [0, π/2], inner steps = 5, inner lr = 10⁻³, meta lr = 10⁻³ |
| Software Dependencies | No | The paper mentions using 'MAML algorithm', 'SGD', and 'Adam' for optimization, but does not specify any software libraries or frameworks with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | For all of these experiments we used a fully connected network with 6 layers and 40 hidden units per layer. The network is trained using the MAML algorithm (Finn et al., 2017) with 5 inner steps using SGD with an inner learning rate of 10⁻³. We used Adam for the outer-loop learning with a learning rate of 10⁻³. ... Hyperparameters: σ: noise at test time; M: number of training tasks; Mq: number of testing tasks; eps per batch: episodes per batch; train ampl range: range of amplitude at training; train phase range: range of phase at training; val ampl range: range of amplitude at testing; val phase range: range of phase at testing; inner steps: number of MAML inner steps; inner lr: learning rate used to optimize the model parameters; meta lr: learning rate used to optimize the meta-learner parameters; n: datapoints per training task (support set); k: datapoints per testing task (support set); nq: datapoints per training task (query set); kq: datapoints per testing task (query set). A PyTorch sketch of this setup appears after the table. |
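The sinusoid data-generation process quoted in the Open Datasets row is simple enough to sketch. Below is a minimal reconstruction in Python: the amplitude and phase ranges and the noise level σ come from the hyperparameter table above, while the function name, the input range [-5, 5], and the exact form y = A·sin(x + b) are our assumptions (the paper only states that a prior is placed over the amplitude and phase).

```python
# Hedged sketch of one sinusoid meta-learning episode. The input range
# [-5, 5] and the form y = A*sin(x + b) are assumptions, not taken from
# the paper; amplitude/phase ranges and sigma follow the table above.
import math
import torch

def sample_sinusoid_task(n_support, n_query, ampl_range=(1.0, 4.0),
                         phase_range=(0.0, math.pi / 2), sigma=0.0):
    """Draw one task: sample (A, b) from the task prior, then sample points."""
    A = torch.empty(1).uniform_(*ampl_range)   # train ampl range = [1, 4]
    b = torch.empty(1).uniform_(*phase_range)  # train phase range = [0, pi/2]

    def draw(m):
        x = torch.empty(m, 1).uniform_(-5.0, 5.0)            # assumed input range
        y = A * torch.sin(x + b) + sigma * torch.randn(m, 1)  # noisy observations
        return x, y

    x_support, y_support = draw(n_support)
    x_query, y_query = draw(n_query)
    return x_support, y_support, x_query, y_query
```

At meta-test time the same sampler would be called with ampl_range=(3.0, 5.0) and a nonzero sigma, matching the val ampl range and σ entries in the hyperparameter table.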
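The Experiment Setup row likewise describes the training loop concretely enough to sketch end to end. The following is not the authors' code (that is in their supplementary materials) but a minimal PyTorch (≥ 2.0, for torch.func.functional_call) reconstruction under the stated hyperparameters: a 6-layer fully connected network with 40 hidden units per layer, 5 inner SGD steps at learning rate 10⁻³, an Adam outer loop at learning rate 10⁻³, and 25 episodes per batch. The number of meta-steps and the query-set size are our assumptions, and sample_sinusoid_task is the generator sketched above.

```python
# Hedged sketch of the paper's MAML setup (Finn et al., 2017). Network
# depth/width and learning rates follow the stated setup; the meta-step
# count and query-set size are assumptions.
import torch
import torch.nn as nn

def make_net(hidden=40, depth=6):
    """Fully connected net: 6 linear layers, 40 hidden units, scalar in/out."""
    layers, d = [], 1
    for _ in range(depth - 1):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, 1))
    return nn.Sequential(*layers)

def inner_adapt(net, params, x, y, steps=5, inner_lr=1e-3):
    """MAML inner loop: 5 differentiable SGD steps on the support set."""
    for _ in range(steps):
        pred = torch.func.functional_call(net, params, (x,))
        loss = ((pred - y) ** 2).mean()
        grads = torch.autograd.grad(loss, tuple(params.values()),
                                    create_graph=True)  # keep graph for meta-grad
        params = {name: p - inner_lr * g
                  for (name, p), g in zip(params.items(), grads)}
    return params

net = make_net()
meta_opt = torch.optim.Adam(net.parameters(), lr=1e-3)  # meta lr = 1e-3
eps_per_batch = 25
for meta_step in range(1000):  # number of meta-steps: our assumption
    meta_opt.zero_grad()
    for _ in range(eps_per_batch):
        x_s, y_s, x_q, y_q = sample_sinusoid_task(n_support=20, n_query=20)
        adapted = inner_adapt(net, dict(net.named_parameters()), x_s, y_s)
        query_pred = torch.func.functional_call(net, adapted, (x_q,))
        # Query losses are averaged over the batch of episodes; backward()
        # accumulates the meta-gradient into net's parameters.
        (((query_pred - y_q) ** 2).mean() / eps_per_batch).backward()
    meta_opt.step()
```

Because the inner loop keeps the computation graph (create_graph=True), backpropagating the query loss differentiates through the 5 SGD steps, which is what distinguishes MAML's meta-gradient from joint training on pooled data.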