A Unified View on PAC-Bayes Bounds for Meta-Learning

Author: Arezou Rezazadeh

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical examples demonstrate the merits of the proposed bounds and algorithm in comparison to prior PAC-Bayes bounds for meta-learning. Table 2 of the paper compares the different PAC-Bayes bounds on both the permuted-pixels and permuted-labels experiments.
Researcher Affiliation | Academia | Department of Electrical Engineering, Chalmers University of Technology, Gothenburg, Sweden. Correspondence to: Arezou Rezazadeh <arezour@chalmers.se>.
Pseudocode | No | The paper contains mathematical derivations and theoretical bounds but no explicit pseudocode or algorithm blocks.
Open Source Code | No | The authors reproduce the experimental results by directly running the online code from (Amit & Meir, 2018), available at https://github.com/ron-amit/meta-learning-adjusting-priors2, and run their algorithm by replacing the other bounds with their own. (The paper relies on this third-party codebase but does not explicitly state that the authors' own implementation of the proposed bounds or modifications is open-sourced.)
Open Datasets | Yes | The experiments are based on augmentations of the MNIST dataset; see the task-construction sketch after this table.
Dataset Splits | No | The number of training tasks is set as N = 5, and the number of epochs is 100. (The paper mentions training and testing phases/sets but does not specify a distinct validation set or a split for hyperparameter tuning.)
Hardware Specification | No | The paper does not provide any specific details about the hardware used for the experiments.
Software Dependencies | No | The paper does not specify version numbers for any software dependencies (e.g., Python, PyTorch, or other libraries).
Experiment Setup | Yes | With a learning rate of 10^-3, the hyper-prior, prior, hyper-posterior and posterior distributions are given by equations (23), (25), (24) and (26) of the paper, respectively, with κ_p^2 = 100, κ_s^2 = 0.001, and δ = 0.1. For each task τ_i and k = 1, ..., d, the posterior parameter log(σ_i^2(k)) is initialized from N(-10, 0.01), and µ_i(k) is initialized randomly with the Glorot method (Glorot & Bengio, 2010). The number of epochs is 100. (See the initialization sketch after this table.)
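
For concreteness, here is a minimal sketch of how meta-learning tasks can be built from MNIST by pixel or label permutation, as in the permuted-pixels and permuted-labels experiments referenced above. The function name make_task and the use of torchvision are illustrative assumptions; the actual construction lives in the Amit & Meir codebase.

```python
import numpy as np
from torchvision import datasets

def make_task(images, labels, mode="permuted_pixels", seed=0):
    """Build one meta-learning task by applying a fixed random permutation."""
    rng = np.random.default_rng(seed)
    if mode == "permuted_pixels":
        # One pixel permutation shared by every example in the task.
        perm = rng.permutation(28 * 28)
        images = images.reshape(-1, 28 * 28)[:, perm].reshape(-1, 28, 28)
    elif mode == "permuted_labels":
        # One fixed relabeling of the 10 digit classes.
        labels = rng.permutation(10)[labels]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return images, labels

# N = 5 training tasks, matching the count reported in the paper.
mnist = datasets.MNIST("./data", train=True, download=True)
X = mnist.data.numpy().astype(np.float32) / 255.0
y = mnist.targets.numpy()
tasks = [make_task(X, y, seed=i) for i in range(5)]
```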
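The reported experiment setup also translates directly into code. Below is a hedged sketch of the stated posterior initialization for a single task, assuming a factorized Gaussian posterior over the weights and PyTorch (as used by the referenced codebase). The parameter names, layer sizes, and choice of Adam are illustrative assumptions, and N(-10, 0.01) is read as mean -10 and variance 0.01.

```python
import math
import torch
from torch import nn

d_in, d_out = 28 * 28, 10  # illustrative layer sizes, not from the paper

# Posterior mean mu_i: Glorot (Xavier) initialization (Glorot & Bengio, 2010).
mu = nn.Parameter(torch.empty(d_out, d_in))
nn.init.xavier_normal_(mu)

# Posterior log-variance log(sigma_i^2(k)): drawn from N(-10, 0.01),
# reading 0.01 as the variance, hence std = sqrt(0.01) = 0.1.
log_var = nn.Parameter(
    torch.empty(d_out, d_in).normal_(mean=-10.0, std=math.sqrt(0.01))
)

# Hyperparameters reported in the paper.
kappa_p_sq = 100.0   # hyper-prior variance parameter
kappa_s_sq = 0.001   # prior variance parameter
delta = 0.1          # confidence level of the PAC-Bayes bound
num_epochs = 100

# Optimizer choice is an assumption; only the learning rate 1e-3 is reported.
optimizer = torch.optim.Adam([mu, log_var], lr=1e-3)
```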