Meta-Curvature
Authors: Eunbyung Park, Junier B. Oliva
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effects of our proposed method on several few-shot learning tasks and datasets. Without any task specific techniques and architectures, the proposed method achieves substantial improvement upon previous MAML variants and outperforms the recent state-of-the-art methods. Furthermore, we observe faster convergence rates of the meta-training process. Finally, we present an analysis that explains better generalization performance with the meta-trained curvature. |
| Researcher Affiliation | Academia | Eunbyung Park Department of Computer Science University of North Carolina at Chapel Hill eunbyung@cs.unc.edu Junier B. Oliva Department of Computer Science University of North Carolina at Chapel Hill joliva@cs.unc.edu |
| Pseudocode | Yes | We provide the details of the algorithm in the appendices. |
| Open Source Code | Yes | The code is available at https://github.com/silverbottlep/meta_curvature |
| Open Datasets | Yes | We evaluated our methods on few-shot regression and few-shot classification tasks over Omniglot [19], mini Imagenet [44], and tiered Imagenet [35] datasets. |
| Dataset Splits | Yes | The mini Imagenet dataset was proposed by [44, 34] and it consists of 100 subclasses out of 1000 classes in the original dataset (64 training classes, 12 validation classes, 24 test classes). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments. |
| Software Dependencies | No | The paper mentions using the ADAM optimizer but does not specify the versions of any software libraries or the programming language used. |
| Experiment Setup | Yes | The network architecture and all hyperparameters are the same as in [9]; we only introduce the proposed meta-curvature and follow the experimental protocol of [9]. We used a 4-layer convolutional neural network with batch normalization, followed by a fully connected layer for the final classification. To avoid overfitting, we applied the data augmentation techniques suggested in [5, 6]. (A minimal sketch of this backbone appears after the table.) |
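
The Experiment Setup row describes the standard 4-layer convolutional backbone inherited from MAML [9]: four blocks of 3x3 convolution, batch normalization, ReLU, and 2x2 max pooling, followed by a single fully connected classifier. The sketch below is a minimal PyTorch rendering of that description for orientation only; it is not the authors' released code, and the filter count (e.g., 32 for mini Imagenet vs. 64 for Omniglot in [9]) and the 84x84 input size are assumptions.

```python
import torch
import torch.nn as nn

class Conv4Backbone(nn.Module):
    """Minimal sketch of the 4-layer conv backbone described in the
    Experiment Setup row (as in MAML [9]). Filter count and input size
    are assumptions, not taken from the paper's released code."""

    def __init__(self, n_way: int = 5, in_channels: int = 3, hidden: int = 32):
        super().__init__()
        layers = []
        c = in_channels
        for _ in range(4):
            layers += [
                nn.Conv2d(c, hidden, kernel_size=3, padding=1),
                nn.BatchNorm2d(hidden),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            ]
            c = hidden
        self.features = nn.Sequential(*layers)
        # For 84x84 mini Imagenet inputs, four 2x2 poolings leave a 5x5 map.
        self.classifier = nn.Linear(hidden * 5 * 5, n_way)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x)
        return self.classifier(h.flatten(start_dim=1))

# Example: a 5-way episode batch of 84x84 RGB images.
logits = Conv4Backbone(n_way=5)(torch.randn(25, 3, 84, 84))
print(logits.shape)  # torch.Size([25, 5])
```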