Bridging the Gap Between Practice and PAC-Bayes Theory in Few-Shot Meta-Learning
Authors: Nan Ding, Xi Chen, Tomer Levinboim, Sebastian Goodman, Radu Soricut
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 5, we conduct numerical experiments that empirically support the correctness of our theorems, and report the effectiveness of the new PACMAML algorithm, which obtains superior results on several few-shot benchmark datasets. |
| Researcher Affiliation | Industry | Nan Ding, Google Research, dingnan@google.com; Xi Chen, Google Research, chillxichen@google.com; Tomer Levinboim, Google Research, tomerl@google.com; Sebastian Goodman, Google Research, seabass@google.com; Radu Soricut, Google Research, rsoricut@google.com |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. Procedural steps are described in text. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the described methodology, nor does it include links to a code repository. |
| Open Datasets | Yes | We experiment with the synthetic Sinusoid environment (details in Appendix D.2)...Our first classification experiment is based on the miniImagenet classification task [25]...Our experiment involves 12 practical natural language inference tasks from [4] which include: (1) entity typing: CoNLL-2003, MIT-Restaurant; (2) rating classification: the review ratings from the Amazon Reviews dataset in the domain of Books, DVD, Electronics, Kitchen; (3) text classification: social-media datasets from crowdflower that include Airline, Disaster, Emotion, Political Bias, Political Audience, Political Message...used GLUE benchmark tasks [26] for meta-training the models. |
| Dataset Splits | Yes | The number of training examples for each target task is fixed to be m = 5, and another 100 examples for each target task are used as a test set to evaluate the generalization error. We report the averaged generalization error over 40 models, with the hyperparameters selected by 4-fold cross-validation over the 20 target tasks...The dataset consists of 60,000 color images of 84×84 dimension. The examples consist of a total of 100 classes that are partitioned into 64, 12, and 24 classes for meta-train, meta-validation, and meta-test, respectively. (See the split sketch after the table.) |
| Hardware Specification | No | The paper mentions "TPU memory (High Bandwidth Memory)" in Table 2 (bottom) when comparing memory usage, implying the use of TPUs. However, it does not specify exact models of GPUs, CPUs, or specific TPU versions (e.g., TPU v2, v3) or other detailed computer specifications for the experiments. |
| Software Dependencies | No | The paper mentions "Tensorflow [1] or Pytorch [17]" for automatic gradient computations, but it does not specify version numbers for these software dependencies. |
| Experiment Setup | Yes | For all algorithms, we optimize for 6 steps in the inner loop to obtain the inner adaptive parameter (or a posterior sample w). The data sizes of the observed tasks are varied over mᵢ ∈ {10, 20, 40, 80} and m̃ᵢ = m̃ = 5 (one shot for each of 5 classes). We fixed α/β = m̃ᵢ/mᵢ and performed grid search over α as well as the meta and inner learning rates on the meta-validation dataset. Other hyperparameters followed the setting in [9]. (A minimal inner-loop sketch follows the table.) |
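
The split protocol quoted in the Dataset Splits row maps onto a simple partitioning routine. Below is a minimal Python sketch of the 64/12/24 class partition and the 4-fold cross-validation over the 20 target tasks; the function and variable names (`partition_classes`, `four_fold_task_cv`) are illustrative, not from the paper.

```python
import random

def partition_classes(num_classes=100, sizes=(64, 12, 24), seed=0):
    """Partition class IDs into meta-train / meta-val / meta-test pools,
    mirroring the 64/12/24 split quoted above."""
    assert sum(sizes) == num_classes
    ids = list(range(num_classes))
    random.Random(seed).shuffle(ids)
    meta_train = ids[:sizes[0]]
    meta_val = ids[sizes[0]:sizes[0] + sizes[1]]
    meta_test = ids[sizes[0] + sizes[1]:]
    return meta_train, meta_val, meta_test

def four_fold_task_cv(task_ids):
    """Yield (train_tasks, held_out_tasks) folds: 4-fold cross-validation
    over the 20 target tasks used for hyperparameter selection."""
    k = 4
    fold_size = len(task_ids) // k  # 20 tasks -> folds of 5
    for i in range(k):
        held_out = task_ids[i * fold_size:(i + 1) * fold_size]
        train = [t for t in task_ids if t not in held_out]
        yield train, held_out

meta_train, meta_val, meta_test = partition_classes()
for train_tasks, held_out in four_fold_task_cv(list(range(20))):
    pass  # fit with one hyperparameter setting on train_tasks, score on held_out
```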
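The Experiment Setup row describes a 6-step inner loop followed by grid search over α and the learning rates. The sketch below illustrates a generic MAML-style 6-step inner adaptation in PyTorch; it is not the exact PACMAML posterior update, and all names (`inner_adapt`, `support_x`, `support_y`) are illustrative assumptions.

```python
import torch

def inner_adapt(params, loss_fn, support_x, support_y, inner_lr, steps=6):
    """Take `steps` gradient steps on one task's support set to obtain the
    inner adaptive parameters, as in the 6-step inner loop quoted above.
    Generic MAML-style sketch, not the exact PACMAML update."""
    adapted = [p.clone() for p in params]  # params must have requires_grad=True
    for _ in range(steps):
        loss = loss_fn(adapted, support_x, support_y)
        # create_graph=True retains second-order terms for the outer meta-update
        grads = torch.autograd.grad(loss, adapted, create_graph=True)
        adapted = [p - inner_lr * g for p, g in zip(adapted, grads)]
    return adapted
```

Note that under the quoted constraint α/β = m̃ᵢ/mᵢ, fixing α determines β as β = α·mᵢ/m̃ᵢ, so the grid search effectively runs over α and the two learning rates only.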