Modular Meta-Learning with Shrinkage

Authors: Yutian Chen, Abram L. Friesen, Feryal Behbahani, Arnaud Doucet, David Budden, Matthew Hoffman, Nando de Freitas

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we demonstrate that our method discovers a small set of meaningful task-specific modules and outperforms existing meta-learning approaches in domains like few-shot text-to-speech that have little task data and long adaptation horizons.
Researcher Affiliation | Industry | DeepMind, London, UK {yutianc, abef}@google.com
Pseudocode | Yes | Figure 1: (Left) Structure of a typical meta-learning algorithm. (Right) Bayesian shrinkage graphical model. The shared meta parameters φ serve as the initialization of the neural network parameters for each task θt. The σ are shrinkage parameters. By learning these, the model automatically decides which subsets of parameters (i.e., modules) to fix for all tasks and which to adapt at test time.
Open Source Code | No | The paper does not explicitly provide an unambiguous statement or link to its own open-source code for the methodology described.
Open Datasets | Yes | We use the augmented Omniglot protocol of Flennerhag et al. [4], which necessitates long-horizon adaptation.
Dataset Splits | Yes | We use 30 training alphabets (T = 30), 15 training images (K = 15), and 5 validation images per class.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or cloud instance specifications).
Software Dependencies | No | The paper mentions using specific algorithms like 'conjugate gradient algorithm' and 'Adam [46]', but does not provide specific software names with version numbers (e.g., 'PyTorch 1.9', 'TensorFlow 2.0').
Experiment Setup | Yes | Following Flennerhag et al. [4], we use a 4-layer convnet and perform 100 steps of task adaptation.
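
The Figure 1 caption quoted in the Pseudocode row says that the shrinkage parameters σ decide which modules stay fixed and which adapt. A toy sketch can make that concrete. It assumes a per-module Gaussian penalty that pulls each task parameter θ toward the shared φ with strength 1/σ², added to a stand-in quadratic task loss. The function name `adapt`, the module names `shared` and `adaptive`, the learning rate, and the toy loss are all hypothetical illustrations and are not taken from the paper. Here σ is fixed by hand, whereas the paper learns it.

```python
def adapt(phi, sigma, target, steps=100, lr=0.05):
    """Toy task adaptation: gradient descent on a quadratic task loss
    plus a per-module shrinkage penalty (theta - phi)^2 / (2 * sigma^2).

    All names and the quadratic loss are illustrative assumptions.
    """
    # Each task starts from the shared meta parameters phi.
    theta = {m: list(values) for m, values in phi.items()}
    for _ in range(steps):
        for m in theta:
            for i in range(len(theta[m])):
                # Gradient of the toy task loss 0.5 * (theta - target)^2.
                grad_task = theta[m][i] - target[m][i]
                # Gradient of the shrinkage penalty for module m.
                grad_prior = (theta[m][i] - phi[m][i]) / sigma[m] ** 2
                theta[m][i] -= lr * (grad_task + grad_prior)
    return theta


# Two hypothetical modules with the same initialization and task target.
phi = {"shared": [0.0], "adaptive": [0.0]}
target = {"shared": [1.0], "adaptive": [1.0]}
# Small sigma: module is held near phi. Large sigma: module is free to adapt.
sigma = {"shared": 0.2, "adaptive": 10.0}

theta = adapt(phi, sigma, target, steps=100)
print(theta)
```

With these settings, the module with small σ should remain close to its shared value while the module with large σ moves most of the way to the task target, which is the behavior the caption describes as fixing some modules for all tasks and adapting others.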