Unraveling Meta-Learning: Understanding Feature Representations for Few-Shot Tasks

Authors: Micah Goldblum, Steven Reich, Liam Fowl, Renkun Ni, Valeriia Cherepanova, Tom Goldstein

ICML 2020

Reproducibility variables, each with the extracted result and the supporting LLM response:
Research Type: Experimental
We develop a better understanding of the underlying mechanics of meta-learning and the difference between models trained using meta-learning and models which are trained classically. In doing so, we introduce and verify several hypotheses for why meta-learned models perform better. Furthermore, we develop a regularizer which boosts the performance of standard training routines for few-shot classification. In many cases, our routine outperforms meta-learning while simultaneously running an order of magnitude faster. In Table 1, we test the performance of meta-learned feature extractors not only with their own fine-tuning algorithm, but with a variety of fine-tuning algorithms. We find that in all cases, the meta-learned feature extractors outperform classically trained models of the same architecture.
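The Table 1 comparison fixes a feature extractor and swaps the few-shot classifier fit on top of it. As a minimal sketch of that kind of protocol (not the paper's exact fine-tuning algorithms; both helper functions and the classifier choices here are illustrative assumptions), one could compare a nearest-centroid head against a small linear head on frozen features:

```python
import torch
import torch.nn.functional as F

def nearest_centroid_accuracy(support_feats, support_labels, query_feats, query_labels):
    """Classify query features by distance to per-class support centroids."""
    classes = support_labels.unique()
    centroids = torch.stack([support_feats[support_labels == c].mean(dim=0) for c in classes])
    dists = torch.cdist(query_feats, centroids)        # (n_query, n_classes)
    preds = classes[dists.argmin(dim=1)]
    return (preds == query_labels).float().mean().item()

def linear_head_accuracy(support_feats, support_labels, query_feats, query_labels, steps=100):
    """Fit a logistic-regression head on frozen support features, then score queries.
    Assumes labels are indices 0..n_classes-1."""
    n_classes = int(support_labels.max().item()) + 1
    head = torch.nn.Linear(support_feats.shape[1], n_classes)
    opt = torch.optim.Adam(head.parameters(), lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(head(support_feats), support_labels).backward()
        opt.step()
    preds = head(query_feats).argmax(dim=1)
    return (preds == query_labels).float().mean().item()
```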
Researcher Affiliation: Academia
University of Maryland, College Park.
Pseudocode: Yes
Algorithm 1: The meta-learning framework. Algorithm 2: Reptile with Weight-Clustering Regularization.
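Algorithm 2 builds on Reptile, so for orientation here is a minimal sketch of the standard Reptile outer update (Nichol et al., 2018). The weight-clustering term enters the inner-loop loss; since its exact form is given in the paper, it appears here only as a hypothetical `weight_cluster_penalty` hook:

```python
import copy
import torch

def reptile_step(model, task_loader, inner_lr=1e-2, outer_lr=0.1, inner_steps=5,
                 weight_cluster_penalty=None):
    """One Reptile outer update: fine-tune a copy of the model on a sampled task,
    then move the shared initialization toward the fine-tuned weights.
    Assumes task_loader yields at least inner_steps batches."""
    fast = copy.deepcopy(model)
    opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
    batches = iter(task_loader)
    for _ in range(inner_steps):
        x, y = next(batches)
        loss = torch.nn.functional.cross_entropy(fast(x), y)
        if weight_cluster_penalty is not None:
            # Hypothetical hook standing in for the paper's weight-clustering
            # regularizer (Algorithm 2); the exact term is defined in the paper.
            loss = loss + weight_cluster_penalty(fast)
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Outer update: theta <- theta + outer_lr * (theta_tilde - theta)
    with torch.no_grad():
        for p, q in zip(model.parameters(), fast.parameters()):
            p.add_(outer_lr * (q - p))
```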
Open Source Code: Yes
A PyTorch implementation of the feature clustering and hyperplane variation regularizers can be found at: https://github.com/goldblum/FeatureClustering
Open Datasets: Yes
We focus our attention on two datasets: mini-ImageNet and CIFAR-FS. Mini-ImageNet is a pruned and downsized version of the ImageNet classification dataset, consisting of 60,000 84×84 RGB color images from 100 classes (Vinyals et al., 2016). The CIFAR-FS dataset samples images from CIFAR-100 (Bertinetto et al., 2018).
Dataset Splits: Yes
These 100 classes are split into 64, 16, and 20 classes for training, validation, and testing sets, respectively. CIFAR-FS is split in the same way as mini-ImageNet, with 60,000 32×32 RGB color images from 100 classes divided into 64, 16, and 20 classes for training, validation, and testing sets, respectively.
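Few-shot tasks are then drawn episodically from whichever class split is in use. A generic sketch of N-way K-shot episode construction (the sampling details here are conventional for this benchmark setup, not the authors' exact loader):

```python
import random

def sample_episode(images_by_class, n_way=5, k_shot=1, n_query=15):
    """Sample an N-way K-shot task: a support set and a query set
    over n_way classes drawn at random from one split."""
    classes = random.sample(list(images_by_class), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        picks = random.sample(images_by_class[cls], k_shot + n_query)
        support += [(img, label) for img in picks[:k_shot]]
        query += [(img, label) for img in picks[k_shot:]]
    return support, query

# e.g. images_by_class maps each of the 64 training classes to its list of images
```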
Hardware Specification: No
The paper does not provide specific details on the hardware (e.g., CPU or GPU models, or memory) used for running the experiments.
Software Dependencies: No
The paper mentions 'a PyTorch implementation' but does not provide specific version numbers for PyTorch or any other software dependencies.
Experiment Setup: Yes
We incorporate this regularizer into a standard training routine by sampling two images per class in each mini-batch so that we can compute a within-class variance estimate. Then, the total loss function becomes the sum of cross-entropy and R_FC. See Appendix A.2 for experimental details, including training times. Experimental details, as well as results for other values of this coefficient, can be found in Appendix A.3.
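As a hedged sketch of how that setup composes, the snippet below penalizes within-class feature distance (between the two sampled images of each class) normalized by the spread of features around the batch mean, and adds it to cross-entropy. The exact R_FC formula is given in the paper; this normalization, the function names, and `reg_coeff` (the coefficient referenced above) are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def feature_clustering_penalty(feats_a, feats_b, eps=1e-8):
    """Sketch of an R_FC-style term. feats_a, feats_b: (n_classes, d) features
    of the two sampled images per class; the paper's exact formula may differ."""
    within = (feats_a - feats_b).pow(2).sum(dim=1)      # per-class variance estimate
    mu = torch.cat([feats_a, feats_b]).mean(dim=0)      # mean feature over the batch
    spread = (feats_a - mu).pow(2).sum(dim=1) + (feats_b - mu).pow(2).sum(dim=1)
    return (within / (spread + eps)).mean()

def training_loss(logits, labels, feats_a, feats_b, reg_coeff=1.0):
    """Total loss = cross-entropy + reg_coeff * R_FC, per the setup above."""
    return F.cross_entropy(logits, labels) + reg_coeff * feature_clustering_penalty(feats_a, feats_b)
```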