Learning to Continually Learn with the Bayesian Principle
Authors: Soochan Lee, Hyeonseong Jeon, Jaehyeon Son, Gunhee Kim
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the efficacy of our framework on a wide range of domains, including both supervised and unsupervised tasks. We also provide PyTorch (Paszke et al., 2019) code, ensuring the reproducibility of all experiments. Due to page limitations, we present only the most essential information; for further details, please refer to the code. We present our classification, regression, and deep generative modeling results in Tables 2, 3, and 4, respectively. Fig. 3 compares the generalization abilities in longer training streams, while Table 5 summarizes generalization to a different dataset. |
| Researcher Affiliation | Academia | Soochan Lee 1 Hyeonseong Jeon 1 Jaehyeon Son 1 Gunhee Kim 1 1Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea. Correspondence to: Gunhee Kim <gunhee@snu.ac.kr>. |
| Pseudocode | No | The paper describes the methodology using mathematical equations and textual descriptions, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/soochan-lee/SB-MCL. |
| Open Datasets | Yes | We conduct experiments with the Omniglot, CASIA, and MS-Celeb-1M datasets, following the setups of Lee et al. (2023). As the popular Omniglot dataset (Lake et al., 2015) causes severe meta-overfitting due to its small size (1.6K classes / 32K images), they repurpose the CASIA (Liu et al., 2011) and MS-Celeb-1M (Guo et al., 2016) datasets for MCL. |
| Dataset Splits | No | The paper describes meta-training and meta-testing sets, and states that for offline learning, "we report the best test score achieved during training", implying some form of internal validation, but it does not specify explicit training/validation/test dataset splits with percentages or sample counts for reproduction. |
| Hardware Specification | Yes | We report the time required to meta-train for 50K steps with a single A40 GPU. |
| Software Dependencies | No | The paper mentions "PyTorch (Paszke et al., 2019) code" but does not provide a specific version number for PyTorch or other software dependencies. |
| Experiment Setup | Yes | In all MCL experiments, we meta-train the methods in a 10-task 10-shot setting: each training stream is a concatenation of 10 tasks with 10 examples each. The hyperparameters are tuned to maximize the performance in the 10-task 10-shot settings. (A stream-construction sketch follows the table.) |
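
The 10-task 10-shot setup is the key configuration detail for reproducing the MCL experiments. As a rough illustration only (not the authors' released code; see the GitHub repository linked above for that), the following PyTorch sketch shows how one such training stream might be assembled. The class pool, image shape, and function names are hypothetical placeholders, and synthetic tensors stand in for the real datasets.

```python
# Hypothetical sketch of assembling a "10-task 10-shot" training stream
# (10 tasks concatenated, 10 examples each), as described in the quoted setup.
# This is NOT the authors' code; dataset handling is replaced by synthetic tensors.
import random
import torch

NUM_TASKS, SHOTS = 10, 10        # the 10-task 10-shot setting from the paper
IMG_SHAPE = (1, 32, 32)          # placeholder image shape, chosen for illustration


def sample_stream(class_pool, examples_per_class):
    """Return one training stream: NUM_TASKS tasks, SHOTS examples per task."""
    classes = random.sample(class_pool, NUM_TASKS)
    xs, ys = [], []
    for task_id, cls in enumerate(classes):
        # A real setup would index into Omniglot / CASIA / MS-Celeb-1M here.
        xs.append(examples_per_class[cls][:SHOTS])
        ys.append(torch.full((SHOTS,), task_id))
    return torch.cat(xs), torch.cat(ys)  # shapes: (100, *IMG_SHAPE) and (100,)


# Toy usage with synthetic data standing in for a real dataset.
pool = list(range(50))
data = {c: torch.randn(SHOTS, *IMG_SHAPE) for c in pool}
x_stream, y_stream = sample_stream(pool, data)
print(x_stream.shape, y_stream.shape)  # torch.Size([100, 1, 32, 32]) torch.Size([100])
```

Each stream in this sketch yields 100 examples (10 tasks × 10 shots), matching the setting quoted in the table.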