Meta-Learning via PAC-Bayesian with Data-Dependent Prior: Generalization Bounds from Local Entropy
Authors: Shiyu Liu, Wei Shi, Zenglin Xu, Shaogao Lv, Yehong Zhang, Hui Wang
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate that our proposed method outperforms the other baselines. ... In this section, we validate our theoretical results and demonstrate the efficiency of the algorithm through numerical experiments, including tasks related to few-shot regression and image classification. |
| Researcher Affiliation | Academia | (1) University of Electronic Science and Technology of China; (2) Artificial Intelligence Innovation and Incubation (AI3) Institute, Fudan University; (3) Nanjing Audit University; (4) Peng Cheng Laboratory; (5) Harbin Institute of Technology, Shenzhen |
| Pseudocode | Yes | Algorithm 1 PAC-MLE algorithm: meta-training phase (sketched after the table) |
| Open Source Code | No | The paper does not provide any specific repository link or explicit code release statement for the methodology described. |
| Open Datasets | Yes | Sinusoids environment [Finn et al., 2017; Finn et al., 2018]... We utilize datasets corresponding to various calibration sessions of the Swiss Free Electron Laser (Swiss FEL) [Milne et al., 2017]... PhysioNet 2012 challenge [Silva et al., 2012]... Intel Berkeley Research Lab temperature sensor dataset (Berkeley-Sensor) [Madden, 2004]... augmented MNIST dataset |
| Dataset Splits | Yes | During the meta-training phase, we choose 10 training tasks, each consisting of 60,000 training examples; while in the meta-test phase, each task is constructed with reduced training samples, specifically 2,000. |
| Hardware Specification | No | The paper does not provide specific hardware details (like GPU/CPU models or memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | To solve each regression task, we employ a fully connected network with 2 hidden layers, each with 64 hidden units. We utilize a four-layer fully-connected network for shuffled pixels experiments, and a four-layer convolutional network for permuted labels experiments. Require: hyper-prior P, datasets {S_i}_{i=1}^n, learning rates η and η′, weight average α; initialize ν₀ ← P. (Illustrative sketches of the algorithm and the networks follow the table.) |
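The report does not reproduce the paper's pseudocode in full, so below is a minimal sketch of what the meta-training phase of Algorithm 1 (PAC-MLE) could look like, reconstructed only from the "Require:" line quoted above. The SGLD-style inner loop, the exponential weight average, the outer update rule, the noise scale, and `task_loss` are all illustrative assumptions, not the paper's exact procedure.

```python
import torch

def task_loss(w, S):
    # Hypothetical per-task empirical risk: least squares on a dataset S = (X, y).
    X, y = S
    return ((X @ w - y) ** 2).mean()

def pac_mle_meta_train(P, tasks, eta=1e-2, eta_prime=1e-2, alpha=0.9,
                       rounds=100, inner_steps=5, noise_scale=1e-3):
    """Sketch of the meta-training phase. P: hyper-prior mean (a flat parameter
    vector); tasks: the datasets {S_i}_{i=1}^n. All update rules are assumed."""
    nu = P.clone()                 # initialize nu_0 <- P
    for _ in range(rounds):
        for S in tasks:
            w = nu.clone().requires_grad_(True)
            w_bar = nu.clone()     # running weight average with factor alpha
            for _ in range(inner_steps):
                (grad,) = torch.autograd.grad(task_loss(w, S), w)
                # SGLD-style step toward a local-entropy maximizer (assumed form)
                w = (w - eta_prime * grad
                     + noise_scale * torch.randn_like(w)).detach().requires_grad_(True)
                w_bar = alpha * w_bar + (1 - alpha) * w.detach()
            # Outer step: pull the hyper-posterior toward the averaged task
            # weights, standing in for the PAC-Bayes meta-objective gradient.
            nu = nu - eta * (nu - w_bar)
    return nu

# Toy usage on 10 synthetic linear-regression tasks.
tasks = []
for _ in range(10):
    X = torch.randn(64, 3)
    w_true = torch.randn(3)
    tasks.append((X, X @ w_true + 0.1 * torch.randn(64)))
nu = pac_mle_meta_train(torch.zeros(3), tasks)
```

The Experiment Setup row also names three backbones; a PyTorch sketch follows. The paper fixes only the depths and the 64-unit hidden width of the regression network, so the input sizes, activations, channel counts, kernel sizes, and the MNIST-shaped inputs below are assumptions.

```python
import torch.nn as nn

def regression_net(in_dim=1, out_dim=1):
    # Fully connected network with 2 hidden layers of 64 units each (as stated).
    return nn.Sequential(
        nn.Linear(in_dim, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, out_dim),
    )

def shuffled_pixels_net(in_dim=28 * 28, n_classes=10, width=256):
    # Four-layer fully connected network; the width of 256 is assumed.
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(in_dim, width), nn.ReLU(),
        nn.Linear(width, width), nn.ReLU(),
        nn.Linear(width, width), nn.ReLU(),
        nn.Linear(width, n_classes),
    )

def permuted_labels_net(n_classes=10, ch=32):
    # Four-layer convolutional network; channels, kernels, and pooling assumed,
    # sized for 1x28x28 (augmented MNIST) inputs.
    return nn.Sequential(
        nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 28 -> 14
        nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 14 -> 7
        nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
        nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
        nn.Flatten(),
        nn.Linear(ch * 7 * 7, n_classes),
    )
```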