Meta-learning Symmetries by Reparameterization
Authors: Allan Zhou, Tom Knowles, Chelsea Finn
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments suggest that it can automatically learn to encode equivariances to common transformations used in image processing tasks. Our experiments show that meta-learning can recover various convolutional architectures from data, and learn invariances to common data augmentation transformations. |
| Researcher Affiliation | Academia | Allan Zhou, Tom Knowles, Chelsea Finn Dept of Computer Science, Stanford University {ayz,tknowles,cbfinn}@stanford.edu |
| Pseudocode | Yes | Algorithm 1: MSR: Meta-Training. Algorithm 2: Augmentation Meta-Training. |
| Open Source Code | Yes | We provide our experiment code at https://github.com/AllanYangZhou/metalearning-symmetries. |
| Open Datasets | Yes | We apply this augmentation strategy to Omniglot (Lake et al., 2015) and MiniImagenet (Vinyals et al., 2016) few-shot classification to create the Aug-Omniglot and Aug-MiniImagenet benchmarks. |
| Dataset Splits | Yes | MAML and MSR split each task's examples into a support set (task training data) and a query set (task validation data). Table 5: Synthetic problem data quantity. |
| Hardware Specification | Yes | We ran all experiments on a single machine with a single NVidia RTX 2080Ti GPU. We ran all experiments on a machine with a single NVidia Titan RTX GPU. |
| Software Dependencies | No | The paper mentions using 'PyTorch', the 'higher' library (Grefenstette et al., 2019), and the 'torchvision' library, but it does not provide specific version numbers for any of these software components. |
| Experiment Setup | Yes | During meta-training we trained each method for 1,000 outer steps on task batches of size 32... We used the Adam (Kingma and Ba, 2014) optimizer in the outer loop with learning rate 0.0005... On training tasks MAML and MSR used 3 SGD steps on the support data... We also used meta-learned per-layer learning rates initialized to 0.02. For all experiments and gradient-based methods we trained for 60,000 (outer) steps using the Adam optimizer with learning rate 0.0005 for MiniImagenet 5-shot and 0.001 for all other experiments. In the inner loop we used SGD with meta-learned per-layer learning rates initialized to 0.4 for Omniglot and 0.05 for MiniImagenet. We meta-trained using a single inner loop step in all experiments, and used 3 inner loop steps at meta-test time. During meta-training we used a task batch size of 32 for Omniglot and 10 for MiniImagenet. (A hedged training-loop sketch based on these settings follows the table.) |
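
As an illustration only, the following is a minimal sketch of a MAML-style outer/inner meta-training loop using the Aug-Omniglot settings quoted in the Experiment Setup row (60,000 Adam outer steps at learning rate 0.001, one inner SGD step at meta-train time, task batch size 32), written with PyTorch and the `higher` library named in the Software Dependencies row. The toy model, the random task sampler, and the fixed inner learning rate of 0.4 (standing in for the paper's meta-learned per-layer rates) are placeholders, not the authors' implementation; the actual code is in the linked repository.

```python
# Minimal sketch of the quoted meta-training setup (NOT the authors' code).
# Assumptions: a toy 2-layer model, a random placeholder task sampler, and a
# fixed inner learning rate of 0.4 in place of meta-learned per-layer rates.
import torch
import torch.nn as nn
import torch.nn.functional as F
import higher  # differentiable inner-loop optimization (Grefenstette et al., 2019)

N_WAY = 5  # placeholder; the benchmarks in the paper define the real task structure

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, N_WAY))
outer_opt = torch.optim.Adam(model.parameters(), lr=0.001)  # outer lr from the quoted setup


def sample_task_batch(batch_size=32):
    """Placeholder sampler: yields (support_x, support_y, query_x, query_y) per task."""
    for _ in range(batch_size):
        xs, ys = torch.randn(5, 1, 28, 28), torch.randint(0, N_WAY, (5,))
        xq, yq = torch.randn(15, 1, 28, 28), torch.randint(0, N_WAY, (15,))
        yield xs, ys, xq, yq


for outer_step in range(60_000):                     # 60,000 outer steps (few-shot setup)
    outer_opt.zero_grad()
    for xs, ys, xq, yq in sample_task_batch(32):     # task batch size 32 (Omniglot)
        inner_opt = torch.optim.SGD(model.parameters(), lr=0.4)  # fixed stand-in inner lr
        # `higher` tracks the inner SGD step so the query loss differentiates
        # through adaptation back to the shared initialization (MAML-style).
        with higher.innerloop_ctx(model, inner_opt,
                                  copy_initial_weights=False) as (fmodel, diffopt):
            for _ in range(1):                       # 1 inner step at meta-train time
                diffopt.step(F.cross_entropy(fmodel(xs), ys))
            query_loss = F.cross_entropy(fmodel(xq), yq) / 32
            query_loss.backward()                    # accumulate outer gradients per task
    outer_opt.step()
```

Per the quoted setup, meta-test time differs only in the number of inner steps (3 instead of 1); the rest of the loop structure is unchanged.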