Generalization Bounds For Meta-Learning: An Information-Theoretic Analysis
Authors: Qi Chen, Changjian Shui, Mario Marchand
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate Theorem 6.2 on both synthetic and real data. The numerical results demonstrate that, in most situations, the gradient-incoherence-based bound is orders of magnitude tighter than conventional meta-learning bounds that rely on a Lipschitz assumption, whose Lipschitz constant is estimated with gradient norms. |
| Researcher Affiliation | Academia | Qi Chen (Université Laval), Changjian Shui (Université Laval), Mario Marchand (Université Laval) |
| Pseudocode | No | The paper describes the algorithms (SGLD, Meta-SGLD) and their update rules using textual descriptions and mathematical equations, but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Code is available at: https://github.com/livreQ/meta-sgld. |
| Open Datasets | Yes | To evaluate the proposed bound in modern deep few-shot learning scenarios, we have tested the Meta-SGLD algorithm on the Omniglot dataset [44]. |
| Dataset Splits | Yes | We evaluate on three different few-shot settings with m_va = {1, 8, 15} and the corresponding train sizes m_tr = {15, 8, 1}. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory capacity) used to conduct the experiments. |
| Software Dependencies | No | The paper mentions 'MAML-Pytorch implementation' and 'meta-learning-adjusting-priors' as references, and indicates code availability, but does not provide specific version numbers for software dependencies such as PyTorch, Python, or other libraries used in the implementation. |
| Experiment Setup | Yes | We evaluate on three different few-shot settings with m_va = {1, 8, 15} and the corresponding train sizes m_tr = {15, 8, 1}. At each iteration t, we randomly choose a subset of 5 tasks (\|It\| = 5) from the whole dataset. A train task consists of five classes (characters) randomly chosen from the first 1200 characters; each class has m = 16 samples selected from the 20 instances. At each epoch, we train the model with \|It\| = 32 tasks. The Meta-SGLD algorithm has a nested loop structure: the outer loop includes T iterations of SGLD for updating the meta-parameters U; at each outer-loop iteration t ∈ [T], there are several parallel inner loops, where each loop is a K-iteration SGLD that updates a task-specific parameter Wi. The noise scales are set to σt = √(2ηt/γt) and σt,k = √(2βt,k/γt,k), where γt and γt,k are the inverse temperatures. |
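
The nested-loop structure quoted in the Experiment Setup row can be illustrated with a short sketch. The code below is a minimal illustration, not the authors' released implementation (see https://github.com/livreQ/meta-sgld for that): the callbacks `grad_inner` and `grad_outer`, the helper names `sgld_step`/`meta_sgld`, and the default step sizes and inverse temperature are all hypothetical. It only instantiates the structure described above: T outer SGLD iterations on the meta-parameters, K inner SGLD iterations per sampled task, with Gaussian noise scaled by σ = √(2 · step size / γ) as in the quoted setting.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgld_step(w, grad, step_size, inv_temp, rng):
    """One SGLD update: a gradient step plus Gaussian noise with
    standard deviation sigma = sqrt(2 * step_size / inv_temp)."""
    noise_std = np.sqrt(2.0 * step_size / inv_temp)
    return w - step_size * grad + noise_std * rng.normal(size=w.shape)

def meta_sgld(u, tasks, grad_inner, grad_outer,
              T=100, K=10, n_tasks_per_iter=5,
              eta=0.01, beta=0.01, gamma=1e4, rng=rng):
    """Sketch of the nested-loop Meta-SGLD structure:
    outer loop: T SGLD iterations on the meta-parameters u;
    inner loops: for each sampled task i, a K-iteration SGLD on
    task-specific parameters w_i initialized from u."""
    for t in range(T):
        # Sample a subset of tasks at this outer iteration (|It| tasks).
        idx = rng.choice(len(tasks), size=n_tasks_per_iter, replace=False)
        meta_grad = np.zeros_like(u)
        for i in idx:
            w = u.copy()  # task parameters start from the meta-parameters
            for k in range(K):
                w = sgld_step(w, grad_inner(w, tasks[i]), beta, gamma, rng)
            # Average the (hypothetical) meta-gradient over sampled tasks.
            meta_grad += grad_outer(u, w, tasks[i]) / n_tasks_per_iter
        u = sgld_step(u, meta_grad, eta, gamma, rng)
    return u
```

As the inverse temperatures γt and γt,k grow, the injected noise vanishes and each loop reduces to plain (meta-)SGD, which is why the algorithm is described as an SGLD variant of gradient-based meta-learning.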