PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees

Authors: Jonas Rothfuss, Vincent Fortuin, Martin Josifoski, Andreas Krause

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments, we instantiate our framework with Gaussian Processes (GPs) and Bayesian Neural Networks (BNNs) as base learners. Across several regression and classification environments, our proposed approach achieves state-of-the-art predictive accuracy, while also improving the calibration of the uncertainty estimates.
Researcher Affiliation | Academia | 1 ETH Zurich, Switzerland; 2 EPFL, Switzerland.
Pseudocode | Yes | Algorithm 1: PACOH with SVGD approximation of Q (a hedged sketch follows the table).
Open Source Code | Yes | The source code for PACOH-GP is available at tinyurl.com/pacoh-gp-code. An implementation of PACOH-NN can be found at tinyurl.com/pacoh-nn-code.
Open Datasets | Yes | Swiss Free Electron Laser (SwissFEL) (Milne et al., 2017; Kirschner et al., 2019b), PhysioNet 2012 challenge (Silva et al., 2012), Intel Berkeley Research Lab temperature sensor dataset (Berkeley-Sensor) (Madden, 2004), Omniglot (Lake et al., 2015)
Dataset Splits | No | The paper mentions 30 meta-train and 20 meta-test tasks for Omniglot, and refers to 'target training' and 'target testing' in Figure 1. However, it does not provide specific percentages or counts for train/validation/test dataset splits within each task in the main text.
Hardware Specification | No | The paper discusses computational complexity and memory usage but does not provide specific hardware details such as GPU/CPU models, memory amounts, or detailed computer specifications used for running the experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiments.
Experiment Setup | Yes | we use λ = n, β = m, the negative log-likelihood as loss function and a Gaussian hyper-prior P = N(0, σ_P² I) over the GP prior parameters φ. For regression, we may set p(y|x, θ) = N(y | h_θ(x), σ²)... For classification, we choose p(y|x, θ) = Categorical(softmax(h_θ(x))). Our loss function is the negative log-likelihood... we employ diagonal Gaussian priors, that is, P_φk = N(µ_Pk, diag(σ²_Pk)) with φ_k := (µ_Pk, ln σ_Pk)... Moreover, we use a zero-centered, spherical Gaussian hyper-prior P := N(0, σ_P² I) over the prior parameters φ. Input: SVGD kernel function k(·, ·), step size η.
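Algorithm 1 in the paper approximates the PACOH hyper-posterior Q with Stein Variational Gradient Descent (SVGD) over a set of prior-parameter particles φ_k. Below is a minimal, hypothetical sketch of such an SVGD particle update, not the authors' reference implementation (the linked repositories contain that). The RBF kernel with a fixed bandwidth, the function names (`rbf_kernel`, `svgd_step`, `grad_log_q`), and the toy target in the usage example are assumptions; the actual hyper-posterior score (gradient of the log hyper-prior plus the weighted per-task marginal log-likelihood terms defined in the paper) is left as a user-supplied callback.

```python
# Minimal, hypothetical sketch of an SVGD update for particles approximating
# the PACOH hyper-posterior Q over prior parameters phi (cf. Algorithm 1).
# Not the authors' code; kernel choice and names are illustrative assumptions.
import numpy as np

def rbf_kernel(particles, bandwidth=1.0):
    """RBF kernel matrix and its gradient w.r.t. the first argument.

    grad_k[i, j] = d k(phi_i, phi_j) / d phi_i
    """
    diffs = particles[:, None, :] - particles[None, :, :]   # (K, K, D)
    sq_dists = np.sum(diffs ** 2, axis=-1)                   # (K, K)
    k = np.exp(-sq_dists / (2.0 * bandwidth ** 2))           # kernel matrix
    grad_k = -diffs / bandwidth ** 2 * k[:, :, None]         # (K, K, D)
    return k, grad_k

def svgd_step(particles, grad_log_q, step_size=1e-3):
    """One SVGD update of K particles approximating the hyper-posterior Q.

    particles:  (K, D) array of prior parameters phi_k = (mu_P, ln sigma_P)
    grad_log_q: callback mapping (K, D) particles to (K, D) gradients of the
                log hyper-posterior (log hyper-prior + weighted per-task
                marginal log-likelihood terms, as defined in the paper).
    """
    k, grad_k = rbf_kernel(particles)
    scores = grad_log_q(particles)                            # (K, D)
    # Attraction term: kernel-weighted scores; repulsion term: sum over j of
    # d k(phi_j, phi_i) / d phi_j, which keeps particles spread out.
    phi_update = (k @ scores + grad_k.sum(axis=0)) / particles.shape[0]
    return particles + step_size * phi_update

# Toy usage with a standard-normal stand-in for the hyper-posterior score;
# replace grad_log_q with the PACOH score to mirror Algorithm 1.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    phi = rng.normal(size=(10, 4))                            # K=10 particles, D=4
    for _ in range(100):
        phi = svgd_step(phi, grad_log_q=lambda p: -p, step_size=0.1)
```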
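The setup quoted in the last row also fixes the likelihoods and the prior parameterization. The fragment below is a rough, assumed translation of those choices into code: a diagonal Gaussian prior over base-learner parameters θ with φ = (µ_P, ln σ_P), a zero-centered spherical Gaussian hyper-prior N(0, σ_P² I) over φ, and negative log-likelihood losses for Gaussian regression and softmax classification. Function names and default noise/variance values are illustrative placeholders, not values taken from the paper.

```python
# Hypothetical illustration of the quoted setup choices (not the authors' code).
import numpy as np

def prior_log_prob(theta, phi):
    """log N(theta | mu_P, diag(sigma_P^2)) with phi = (mu_P, ln sigma_P)."""
    mu, log_sigma = np.split(phi, 2)
    var = np.exp(2.0 * log_sigma)
    return -0.5 * np.sum((theta - mu) ** 2 / var + np.log(2.0 * np.pi * var))

def hyper_prior_log_prob(phi, sigma_hp=1.0):
    """Zero-centered spherical Gaussian hyper-prior: log N(phi | 0, sigma_hp^2 I)."""
    return -0.5 * np.sum(phi ** 2 / sigma_hp ** 2 + np.log(2.0 * np.pi * sigma_hp ** 2))

def nll_regression(y, y_pred, sigma=0.1):
    """NLL under p(y | x, theta) = N(y | h_theta(x), sigma^2)."""
    return 0.5 * np.sum((y - y_pred) ** 2 / sigma ** 2 + np.log(2.0 * np.pi * sigma ** 2))

def nll_classification(y_onehot, logits):
    """NLL under p(y | x, theta) = Categorical(softmax(h_theta(x)))."""
    z = logits - logits.max(axis=-1, keepdims=True)          # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return -np.sum(y_onehot * log_probs)
```

In this reading, the prior parameters φ are exactly the quantities the SVGD particles range over, and the regression/classification NLLs play the role of the loss function named in the setup row.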