Meta-Curvature

Authors: Eunbyung Park, Junier B. Oliva

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effects of our proposed method on several few-shot learning tasks and datasets. Without any task-specific techniques and architectures, the proposed method achieves substantial improvement upon previous MAML variants and outperforms the recent state-of-the-art methods. Furthermore, we observe faster convergence rates of the meta-training process. Finally, we present an analysis that explains better generalization performance with the meta-trained curvature.
Researcher Affiliation | Academia | Eunbyung Park, Department of Computer Science, University of North Carolina at Chapel Hill (eunbyung@cs.unc.edu); Junier B. Oliva, Department of Computer Science, University of North Carolina at Chapel Hill (joliva@cs.unc.edu)
Pseudocode | Yes | We provide the details of the algorithm in the appendices. (A hedged sketch of the meta-curvature inner-loop update appears after this table.)
Open Source Code | Yes | The code is available at https://github.com/silverbottlep/meta_curvature
Open Datasets | Yes | We evaluated our methods on few-shot regression and few-shot classification tasks over the Omniglot [19], mini ImageNet [44], and tiered ImageNet [35] datasets.
Dataset Splits | Yes | The mini ImageNet dataset was proposed by [44, 34] and consists of 100 subclasses out of the 1000 classes in the original dataset (64 training classes, 12 validation classes, 24 test classes).
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments.
Software Dependencies | No | The paper mentions using the ADAM optimizer but does not specify its version or the versions of any other software libraries or programming languages used.
Experiment Setup | Yes | We follow the experimental protocol in [9]; the network architecture and all hyperparameters are the same as in [9], and we only introduce the suggested meta-curvature. We used a 4-layer convolutional neural network with batch normalization, followed by a fully connected layer for the final classification. To avoid overfitting, we applied the data augmentation techniques suggested in [5, 6]. (A sketch of this backbone appears after the table.)
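
Since the algorithmic details are deferred to the paper's appendices, the following is a minimal, hedged sketch of the idea behind the meta-curvature inner-loop update: a meta-learned transform rescales the task gradient before a MAML-style parameter update. The helper name `meta_curvature_step`, the tensor shapes, and the diagonal (element-wise) simplification are illustrative assumptions; the paper's actual transform is a Kronecker-factored, per-layer operation, so consult the released code for the exact formulation.

```python
import torch

def meta_curvature_step(params, grads, mc_diag, inner_lr=0.01):
    """One MAML-style inner-loop update with a learned curvature transform.

    Simplified sketch: the paper learns a Kronecker-factored transform per
    layer; here each gradient is rescaled element-wise by a meta-learned
    tensor `mc_diag` (same shape as the parameter) for illustration only.
    """
    updated = []
    for p, g, m in zip(params, grads, mc_diag):
        # Transform the raw task gradient with the meta-learned curvature,
        # then take a gradient-descent step (kept differentiable so the
        # outer loop can meta-learn both the initialization and m).
        updated.append(p - inner_lr * (m * g))
    return updated

# Hypothetical usage: params/grads would come from a task's support-set loss;
# mc_diag is meta-learned alongside the initialization in the outer loop.
params = [torch.randn(64, 3, 3, 3, requires_grad=True)]
grads = [torch.randn_like(params[0])]
mc_diag = [torch.ones_like(params[0], requires_grad=True)]
new_params = meta_curvature_step(params, grads, mc_diag)
```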
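
For the backbone described in the experiment-setup row (four convolutional layers with batch normalization followed by a fully connected classifier), here is a minimal PyTorch sketch under assumed-but-common few-shot defaults: 64 channels, 3x3 kernels, 2x2 max pooling, and 84x84 mini ImageNet inputs. These defaults are illustrative assumptions and may differ from the hyperparameters in [9] or the released code.

```python
import torch
import torch.nn as nn

class Conv4Classifier(nn.Module):
    """Sketch of a standard 4-layer conv backbone used in few-shot learning."""

    def __init__(self, num_classes=5, in_channels=3, hidden=64):
        super().__init__()
        layers = []
        for i in range(4):
            layers += [
                nn.Conv2d(in_channels if i == 0 else hidden, hidden, 3, padding=1),
                nn.BatchNorm2d(hidden),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            ]
        self.features = nn.Sequential(*layers)
        # For 84x84 inputs, four 2x poolings leave a 5x5 feature map.
        self.classifier = nn.Linear(hidden * 5 * 5, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

# Hypothetical usage on a 5-way task with 84x84 images.
model = Conv4Classifier(num_classes=5)
logits = model(torch.randn(2, 3, 84, 84))  # -> shape (2, 5)
```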