More Flexible PAC-Bayesian Meta-Learning by Learning Learning Algorithms
Authors: Hossein Zakerinia, Amin Behjati, Christoph H. Lampert
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Other than our theoretical contributions we also show empirically that our framework improves the prediction quality in practical meta-learning mechanisms. We also report on two experimental studies that allow us to better relate our results to prior work. |
| Researcher Affiliation | Academia | 1Institute of Science and Technology Austria (ISTA) 2Sharif University of Technology. Correspondence to: Hossein Zakerinia <Hossein.Zakerinia@ist.ac.at>. |
| Pseudocode | No | The paper does not contain structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Our implementation is based on the code of Amit & Meir (2018), except that we fixed a bug in their computation of the KL-divergences, which was also present in later works derived from it. Furthermore, we corrected an issue with how the gradients of the objective in Rezazadeh (2022) were computed. All experiments were done with the corrected implementation.1 (Footnote 1 links to https://github.com/hzakerinia/Flexible-PAC-Bayes-Meta-Learning/) |
| Open Datasets | Yes | The experiment consists of two types of tasks based on the MNIST dataset (Le Cun & Cortes, 1998). |
| Dataset Splits | Yes | In these experiments, there are 10 training tasks with 600 samples per task. We evaluate the methods on 20 tasks with 100 samples per task. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and Cross-Entropy loss, but it does not specify version numbers for key software components such as Python, PyTorch, or other libraries, which would be needed to reproduce the software environment. |
| Experiment Setup | Yes | We used the Adam optimizer with a learning rate of 10⁻³, and the number of Monte Carlo iterations was 1 in all experiments. For the fixed parameters of the minimization objective, we set δ = 0.1, and for the variances of the meta-prior π and meta-posterior ρ, we set κ_π = 10² and κ_ρ = 10⁻³. Moreover, the batch size is 128 and the loss function is cross-entropy. In the first 100 epochs, we assume ρ_0 and ρ_1 are equal and minimize the bound to find ρ_0 = ρ_1 and the posteriors Q_i. After 100 epochs, we fix ρ_0, initialize the Q_i by sampling from ρ_0 (since ρ_0 is the meta-distribution over the initialization prior), and optimize the bound for ρ_1 and the Q_i for another 100 epochs. |
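The two-phase schedule quoted in the Experiment Setup row can be pictured with a short sketch. The snippet below is not the authors' implementation; it is a minimal, self-contained PyTorch illustration in which `bound_objective` is a hypothetical placeholder for the PAC-Bayesian meta-learning bound, and ρ_0, ρ_1, and the task posteriors Q_i are represented by Gaussian mean vectors only.

```python
# Minimal sketch (not the authors' code) of the two-phase optimization
# schedule described above. `bound_objective` is a hypothetical stand-in
# for the PAC-Bayesian meta-learning bound.
import torch

torch.manual_seed(0)
DIM, N_TASKS, EPOCHS = 20, 10, 100  # toy dimensions; the paper uses 10 training tasks

rho0 = torch.zeros(DIM, requires_grad=True)  # meta-posterior over the initialization prior
rho1 = torch.zeros(DIM, requires_grad=True)  # second meta-posterior component
Q = [torch.zeros(DIM, requires_grad=True) for _ in range(N_TASKS)]  # per-task posteriors Q_i

def bound_objective(rho0, rho1, Q):
    """Placeholder quadratic surrogate so the sketch runs end to end;
    the real objective is the paper's PAC-Bayes bound."""
    task_terms = sum((q - rho1).pow(2).sum() for q in Q)
    return task_terms + (rho1 - rho0).pow(2).sum()

# Phase 1 (first 100 epochs): rho_0 and rho_1 are tied and optimized
# jointly with the Q_i.
opt = torch.optim.Adam([rho0] + Q, lr=1e-3)  # Adam, learning rate 10^-3 as quoted
for _ in range(EPOCHS):
    opt.zero_grad()
    bound_objective(rho0, rho0, Q).backward()  # rho_1 := rho_0 during this phase
    opt.step()

# Phase 2 (next 100 epochs): fix rho_0, re-initialize the Q_i by sampling
# around rho_0 (noise scale here is illustrative), then optimize rho_1 and the Q_i.
with torch.no_grad():
    rho1.copy_(rho0)
    for q in Q:
        q.copy_(rho0 + 0.03 * torch.randn(DIM))
opt = torch.optim.Adam([rho1] + Q, lr=1e-3)
for _ in range(EPOCHS):
    opt.zero_grad()
    bound_objective(rho0.detach(), rho1, Q).backward()
    opt.step()
```

The placeholder objective and the re-initialization noise are assumptions made only so the sketch runs; the actual bound, model architecture, and sampling variance are those given in the paper and its repository.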