Bridging the Gap Between Practice and PAC-Bayes Theory in Few-Shot Meta-Learning
Authors: Nan Ding, Xi Chen, Tomer Levinboim, Sebastian Goodman, Radu Soricut
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 5, we conduct numerical experiments that empirically support the correctness of our theorems, and report the effectiveness of the new PACMAML algorithm, which obtains superior results on several few-shot benchmark datasets. |
| Researcher Affiliation | Industry | Nan Ding, Google Research, dingnan@google.com; Xi Chen, Google Research, chillxichen@google.com; Tomer Levinboim, Google Research, tomerl@google.com; Sebastian Goodman, Google Research, seabass@google.com; Radu Soricut, Google Research, rsoricut@google.com |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. Procedural steps are described in text. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the described methodology, nor does it include links to a code repository. |
| Open Datasets | Yes | We experiment with the synthetic Sinusoid environment (details in Appendix D.2)...Our first classification experiment is based on the miniImagenet classification task [25]...Our experiment involves 12 practical natural language inference tasks from [4] which include: (1) entity typing: CoNLL-2003, MIT-Restaurant; (2) rating classification: the review ratings from the Amazon Reviews dataset in the domain of Books, DVD, Electronics, Kitchen; (3) text classification: social-media datasets from crowdflower that include Airline, Disaster, Emotion, Political Bias, Political Audience, Political Message...used GLUE benchmark tasks [26] for meta-training the models. |
| Dataset Splits | Yes | The number of training examples for each target task is fixed to be m = 5, and another 100 examples for each target task are used as a test set to evaluate the generalization error. We report the averaged generalization error over 40 models, with the hyperparameters selected by 4-fold cross-validation over the 20 target tasks...The dataset consists of 60,000 color images of 84×84 dimension. The examples consist of a total of 100 classes that are partitioned into 64, 12, and 24 classes for meta-train, meta-validation, and meta-test, respectively. (See the split sketch after the table.) |
| Hardware Specification | No | The paper mentions "TPU memory (High Bandwidth Memory)" in Table 2 (bottom) when comparing memory usage, implying the use of TPUs. However, it does not specify exact models of GPUs, CPUs, or specific TPU versions (e.g., TPU v2, v3) or other detailed computer specifications for the experiments. |
| Software Dependencies | No | The paper mentions "Tensorflow [1] or Pytorch [17]" for automatic gradient computations, but it does not specify version numbers for these software dependencies. |
| Experiment Setup | Yes | For all algorithms, we optimize for 6 steps in the inner loop to obtain the inner adaptive parameter (or a posterior sample w). The data sizes of the observed tasks are varied over mᵢ ∈ {10, 20, 40, 80} and m̃ᵢ = m̃ = 5 (one shot for each of 5 classes). We fixed α/β = m̃ᵢ/mᵢ and performed grid search over α as well as the meta and inner learning rates on the meta-validation dataset. Other hyperparameters followed the setting in [9]. (A minimal inner-loop sketch follows the table.) |
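
The split protocol quoted in the Dataset Splits row maps onto a simple partitioning routine. Below is a minimal Python sketch of the 64/12/24 class partition and the 4-fold cross-validation over the 20 target tasks; the function and variable names (`partition_classes`, `four_fold_task_cv`) are illustrative, not from the paper.

```python
import random

def partition_classes(num_classes=100, sizes=(64, 12, 24), seed=0):
    """Partition class IDs into meta-train / meta-val / meta-test pools,
    mirroring the 64/12/24 split quoted above."""
    assert sum(sizes) == num_classes
    ids = list(range(num_classes))
    random.Random(seed).shuffle(ids)
    meta_train = ids[:sizes[0]]
    meta_val = ids[sizes[0]:sizes[0] + sizes[1]]
    meta_test = ids[sizes[0] + sizes[1]:]
    return meta_train, meta_val, meta_test

def four_fold_task_cv(task_ids):
    """Yield (train_tasks, held_out_tasks) folds: 4-fold cross-validation
    over the 20 target tasks used for hyperparameter selection."""
    k = 4
    fold_size = len(task_ids) // k  # 20 tasks -> folds of 5
    for i in range(k):
        held_out = task_ids[i * fold_size:(i + 1) * fold_size]
        train = [t for t in task_ids if t not in held_out]
        yield train, held_out

meta_train, meta_val, meta_test = partition_classes()
for train_tasks, held_out in four_fold_task_cv(list(range(20))):
    pass  # fit with one hyperparameter setting on train_tasks, score on held_out
```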
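The Experiment Setup row describes a 6-step inner loop followed by grid search over α and the learning rates. The sketch below illustrates a generic MAML-style 6-step inner adaptation in PyTorch; it is not the exact PACMAML posterior update, and all names (`inner_adapt`, `support_x`, `support_y`) are illustrative assumptions.

```python
import torch

def inner_adapt(params, loss_fn, support_x, support_y, inner_lr, steps=6):
    """Take `steps` gradient steps on one task's support set to obtain the
    inner adaptive parameters, as in the 6-step inner loop quoted above.
    Generic MAML-style sketch, not the exact PACMAML update."""
    adapted = [p.clone() for p in params]  # params must have requires_grad=True
    for _ in range(steps):
        loss = loss_fn(adapted, support_x, support_y)
        # create_graph=True retains second-order terms for the outer meta-update
        grads = torch.autograd.grad(loss, adapted, create_graph=True)
        adapted = [p - inner_lr * g for p, g in zip(adapted, grads)]
    return adapted
```

Note that under the quoted constraint α/β = m̃ᵢ/mᵢ, fixing α determines β as β = α·mᵢ/m̃ᵢ, so the grid search effectively runs over α and the two learning rates only.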