Bayesian Few-Shot Classification with One-vs-Each Pólya-Gamma Augmented Gaussian Processes

Authors: Jake Snell, Richard Zemel

ICLR 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present our results on few-shot classification both in terms of accuracy and uncertainty quantification. Additional results comparing the one-vs-each composite likelihood to the softmax, logistic softmax, and Gaussian likelihoods may be found in Section F. |
| Researcher Affiliation | Academia | Jake C. Snell, University of Toronto, Vector Institute (jsnell@cs.toronto.edu); Richard Zemel, University of Toronto, Vector Institute, Canadian Institute for Advanced Research (zemel@cs.toronto.edu) |
| Pseudocode | Yes | Algorithm 1 One-vs-Each Pólya-Gamma GP Learning (a sketch of the underlying Gibbs step follows this table) |
| Open Source Code | Yes | We have made PyTorch code for our experiments publicly available. https://github.com/jakesnell/ove-polya-gamma-gp |
| Open Datasets | Yes | We used the four dataset scenarios described below... CUB. Caltech-UCSD Birds (CUB) (Wah et al., 2011)... mini-Imagenet. The mini-Imagenet dataset (Vinyals et al., 2016)... Omniglot (Lake et al., 2011)... EMNIST dataset (Cohen et al., 2017) |
| Dataset Splits | Yes | mini-Imagenet. The mini-Imagenet dataset (Vinyals et al., 2016) consists of 100 classes with 600 images per class. We used the split proposed by Ravi & Larochelle (2017), which has 64 classes for training, 16 for validation, and 20 for test. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used for experiments, such as GPU or CPU models. |
| Software Dependencies | No | We have made PyTorch code for our experiments publicly available. https://github.com/jakesnell/ove-polya-gamma-gp ... For Pólya-Gamma sampling we use the PyPolyaGamma package. https://github.com/slinderman/pypolyagamma |
| Experiment Setup | Yes | All methods employed the commonly-used Conv4 architecture (Vinyals et al., 2016) (see Table 4 for a detailed specification), except ABML which used 32 filters throughout. All of our experiments used the Adam (Kingma & Ba, 2015) optimizer with learning rate 10⁻³. During training, all models used epochs consisting of 100 randomly sampled episodes. A single gradient descent step on the encoder network and relevant hyperparameters is made per episode. All 1-shot models are trained for 600 epochs and 5-shot models are trained for 400 epochs... Each episode contained 5 classes (5-way) and 16 query examples. (Sketches of the episode sampler and the Conv4 encoder follow this table.) |
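The paper's Algorithm 1 (One-vs-Each Pólya-Gamma GP Learning) builds on the Pólya-Gamma augmentation of Polson et al. (2013) for logistic likelihoods. Below is a minimal sketch of the binary building block only, not the paper's full one-vs-each composite likelihood: a Gibbs sampler that alternates between the auxiliary variables ω and the latent GP values f. The function name `pg_gibbs_gp_binary` is hypothetical, and the `pgdrawv(b, c, out)` call follows the PyPolyaGamma README's vectorized sampling interface.

```python
import numpy as np
from pypolyagamma import PyPolyaGamma

def pg_gibbs_gp_binary(K, y, num_iters=100):
    """Hypothetical sketch: Gibbs sampler for binary GP classification
    with a logistic likelihood via Polya-Gamma augmentation.

    K: (n, n) kernel matrix; y: (n,) labels in {0, 1}.
    Returns the final sample of the latent function values f.
    """
    pg = PyPolyaGamma()
    n = len(y)
    kappa = y - 0.5                               # kappa_i = y_i - 1/2
    f = np.zeros(n)
    omega = np.empty(n)
    K_inv = np.linalg.inv(K + 1e-6 * np.eye(n))   # jitter for stability
    for _ in range(num_iters):
        # Auxiliary variables: omega_i | f_i ~ PG(1, f_i), drawn into omega.
        pg.pgdrawv(np.ones(n), f, omega)
        # Conditionally, f | omega, y is Gaussian with
        # covariance Sigma = (K^-1 + diag(omega))^-1 and mean Sigma @ kappa.
        Sigma = np.linalg.inv(K_inv + np.diag(omega))
        f = np.random.multivariate_normal(Sigma @ kappa, Sigma)
    return f
```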
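The Experiment Setup row specifies 5-way episodes with 16 query examples and epochs of 100 randomly sampled episodes. A minimal sketch of episode sampling under the standard few-shot protocol; the helper name `sample_episode` is illustrative, and drawing 16 query examples per class (rather than per episode) is an assumption the quoted text does not resolve.

```python
import random

def sample_episode(examples_by_class, n_way=5, n_shot=1, n_query=16):
    """Illustrative sketch: sample one few-shot episode with n_way classes,
    and n_shot support plus n_query query examples per class, non-overlapping."""
    classes = random.sample(sorted(examples_by_class), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        picked = random.sample(examples_by_class[cls], n_shot + n_query)
        support += [(x, label) for x in picked[:n_shot]]
        query += [(x, label) for x in picked[n_shot:]]
    return support, query
```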
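The encoder named in the Experiment Setup row is the commonly-used Conv4 backbone of Vinyals et al. (2016): four blocks of 3×3 convolution with 64 filters, batch normalization, ReLU, and 2×2 max-pooling. A sketch consistent with that common recipe (the paper's exact specification is in its Table 4, which is not reproduced here), paired with the stated Adam optimizer at learning rate 10⁻³:

```python
import torch.nn as nn
import torch.optim as optim

def conv_block(in_ch, out_ch):
    # One Conv4 block: 3x3 conv -> batch norm -> ReLU -> 2x2 max-pool.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
        nn.MaxPool2d(2),
    )

class Conv4(nn.Module):
    """Standard four-block convolutional encoder with 64 filters per block."""
    def __init__(self, in_channels=3, hidden=64):
        super().__init__()
        self.blocks = nn.Sequential(
            conv_block(in_channels, hidden),
            conv_block(hidden, hidden),
            conv_block(hidden, hidden),
            conv_block(hidden, hidden),
        )

    def forward(self, x):
        # Flatten per-example feature maps into embedding vectors.
        return self.blocks(x).flatten(start_dim=1)

encoder = Conv4()
optimizer = optim.Adam(encoder.parameters(), lr=1e-3)  # learning rate from the paper
```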