Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
More Flexible PAC-Bayesian Meta-Learning by Learning Learning Algorithms
Authors: Hossein Zakerinia, Amin Behjati, Christoph H. Lampert
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Other than our theoretical contributions we also show empirically that our framework improves the prediction quality in practical meta-learning mechanisms. We also report on two experimental studies that allow us to better relate our results to prior work. |
| Researcher Affiliation | Academia | 1Institute of Science and Technology Austria (ISTA) 2Sharif University of Technology. Correspondence to: Hossein Zakerinia <EMAIL>. |
| Pseudocode | No | The paper does not contain structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Our implementation is based on the code of Amit & Meir (2018), except that we fixed a bug in their computation of the KL-divergences, which was also present in later works derived from it. Furthermore, we corrected an issue with how the gradients of the objective in Rezazadeh (2022) were computed. All experiments were done with the corrected implementation.1 (Footnote 1 links to https://github.com/hzakerinia/Flexible-PAC-Bayes-Meta-Learning/) |
| Open Datasets | Yes | The experiment consists of two types of tasks based on the MNIST dataset (LeCun & Cortes, 1998). |
| Dataset Splits | Yes | In these experiments, there are 10 training tasks with 600 samples per task. We evaluate the methods on 20 tasks with 100 samples per task. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and Cross-Entropy loss, but it does not specify version numbers for any key software components like Python, PyTorch, or other libraries, which are necessary for reproducible software dependencies. |
| Experiment Setup | Yes | We used the Adam optimizer with a learning rate of 10⁻³ and the number of Monte Carlo iterations was 1 in all experiments. For the fixed parameters of the minimization objective, we put δ = 0.1, and for the variances of meta-prior π and meta-posterior ρ, we set κ_π = 10² and κ_ρ = 10⁻³. Moreover, the batch size is 128 and the used loss function is Cross-Entropy loss. In the first 100 epochs, we assume ρ₀ and ρ₁ are equal and we minimize the bound to find ρ₀ = ρ₁ and posteriors Qi. After 100 epochs, we fix ρ₀, initialize the Qi by sampling from ρ₀ (since ρ₀ is supposed to be the meta-distribution over the initialization prior) and optimize the bound for ρ₁ and the Qi for another 100 epochs. |
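The Experiment Setup row describes a two-phase schedule: 100 epochs with the two meta-level components tied and optimized jointly with the per-task posteriors, then 100 epochs with the first component frozen and the posteriors re-initialized from it. The sketch below illustrates only that control flow; the quadratic "bound", the names `p0`/`p1` (meta-level components), `q` (per-task posteriors), and the targets are hypothetical stand-ins, not the authors' PAC-Bayes objective or code.

```python
# Toy sketch of a two-phase meta-training schedule: tie the meta-level
# components, then freeze one and re-initialize the posteriors from it.
# The quadratic objective is an illustrative placeholder.

def mean(xs):
    return sum(xs) / len(xs)

def objective_grads(p, q, targets):
    """Gradients of the toy objective
    sum_i (q_i - t_i)^2 + (p - mean(q))^2."""
    n = len(q)
    m = mean(q)
    grad_p = 2.0 * (p - m)
    grad_q = [2.0 * (qi - ti) - (2.0 / n) * (p - m)
              for qi, ti in zip(q, targets)]
    return grad_p, grad_q

def train_two_phase(targets=(1.0, 2.0, 3.0), epochs=100, lr=0.1):
    # Phase 1: the two meta-level components are tied (p0 == p1 == p)
    # and optimized jointly with the per-task posteriors q_i.
    p = 0.0
    q = [0.0] * len(targets)
    for _ in range(epochs):
        gp, gq = objective_grads(p, q, targets)
        p -= lr * gp
        q = [qi - lr * gi for qi, gi in zip(q, gq)]

    # Phase 2: freeze p0, re-initialize the q_i from p0 (the paper
    # samples from a distribution; we copy deterministically here),
    # then optimize p1 and the q_i for another block of epochs.
    p0 = p
    p1 = p
    q = [p0] * len(targets)
    for _ in range(epochs):
        gp, gq = objective_grads(p1, q, targets)
        p1 -= lr * gp
        q = [qi - lr * gi for qi, gi in zip(q, gq)]
    return p0, p1, q
```

Under this toy objective both phases converge, with each posterior near its per-task target and the meta-level component near the mean of the targets; the only point being made is the schedule (joint phase, then freeze-and-re-initialize), which mirrors the quoted setup.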