Online PAC-Bayes Learning

Authors: Maxime Haddouche, Benjamin Guedj

NeurIPS 2022

Reproducibility assessment — each item lists the variable, the result, and the LLM response:
- Research Type — Experimental: "We then propose several algorithms with their associated training and test bounds, as well as a short series of experiments to evaluate the consistency of our online PAC-Bayesian approach. Our efficiency criterion is not the classical regret but an expected cumulative loss close to the one of Wintenberger [2021]. More precisely, Sec. 3 proposes a stable yet time-consuming Gibbs-based algorithm, while Sec. 4 proposes time-efficient yet volatile algorithms. We emphasise that our PAC-Bayesian results only require a bounded loss to hold: no assumption is made on the data distribution, priors can be data-dependent, and we do not require any convexity assumption on the loss, as commonly assumed in the OL framework. Sec. 5 gathers supporting experiments."
- Researcher Affiliation — Academia: Maxime Haddouche (Inria and University College London, France and UK); Benjamin Guedj (Inria and University College London, France and UK).
- Pseudocode — Yes: Algorithm 1, "A general OPBD algorithm for Gaussian measures with fixed variance."
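A schematic, hedged reading of what such an OPBD step could look like: the online posterior at round i is a Gaussian N(mu_i, sigma^2 I) whose covariance is held fixed, and only the mean is updated from the newly revealed example. The squared loss and the plain gradient step below are illustrative assumptions, not the paper's exact update rule.

```python
import numpy as np

def opbd_gaussian(data, d, lam, sigma, rng=None):
    """Sketch of an online PAC-Bayes update for N(mu, sigma^2 I), fixed variance.

    `data` is a sequence of (x, y) pairs revealed one at a time; `lam` plays the
    role of the lambda parameter quoted in the experiment setup. Illustrative only.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    mu = np.zeros(d)                       # initial mean, 0 in R^d
    for x, y in data:
        w = rng.normal(mu, sigma)          # draw a predictor from the current posterior
        pred = w @ x
        grad = 2.0 * (pred - y) * x        # gradient of an illustrative squared loss
        mu = mu - lam * grad               # update the mean; covariance stays fixed
    return mu
```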
- Open Source Code — Yes: "anonymised code available here." Checklist answer to "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?": "[Yes] We have included the URL in Sec. 5; note that this is an anonymous repository."
- Open Datasets — Yes: "We consider four real-world datasets: two for classification (Breast Cancer and Pima Indians) and two for regression (Boston Housing and California Housing). All datasets except Pima Indians have been directly extracted from sklearn [Pedregosa et al., 2011]. The Breast Cancer dataset [Street et al., 1993] is available here and comes from the UCI ML repository, as does the Boston Housing dataset [Belsley et al., 2005], which can be obtained here. The California Housing dataset [Pace and Barry, 1997] comes from the StatLib repository and is available here. Finally, the Pima Indians dataset [Smith et al., 1988] has been recovered from this Kaggle repository."
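As a minimal sketch of reproducing the data access, the sklearn-hosted classification dataset can be loaded directly. Note two assumptions here: Breast Cancer ships bundled with scikit-learn, while `load_boston` was removed in scikit-learn 1.2, so a pre-1.2 version would be needed for Boston Housing (not shown), and California Housing is downloaded on first use via `fetch_california_housing`.

```python
# Hedged sketch: loading one of the sklearn-hosted datasets named in the paper.
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)  # binary classification task
print(X.shape, y.shape)
```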
- Dataset Splits — No: The paper mentions using several datasets and permuting observations, but it does not specify any training, validation, or test splits by percentage or sample count, nor does it refer to predefined splits.
- Hardware Specification — Yes: "We ran our experiments on a 2021 MacBook Pro with an M1 chip and 16 GB RAM."
- Software Dependencies — No: The paper mentions "extracted from sklearn [Pedregosa et al., 2011]" but does not provide a specific version number for scikit-learn or any other software dependency used in the experiments.
- Experiment Setup — Yes: "For OGD, the initialisation point is 0 ∈ R^d and the learning rate is set to η = 1/√m. For SVB, the mean is initialised to 0 ∈ R^d and the covariance matrix to Diag(1). The step at time i is η_i = 0.1/i. For both of the OPB algorithms with Gibbs posterior, we chose λ = 1/m. As priors, we took respectively a centered Gaussian vector with covariance matrix Diag(σ²) (σ = 1.5) and an i.i.d. vector following the standard Laplace distribution. For the OPBD algorithm with 1, we chose λ = 10^-4/m, the initial mean is 0 ∈ R^d and our fixed covariance matrix is Diag(σ²) with σ = 3·10^-3. For the OPBD algorithm with 1, we chose λ = 2·10^-3/m, the initial mean is 0 ∈ R^d and our covariance matrix is Diag(σ²) with σ = 10^-2."
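The OGD baseline quoted above can be sketched in a few lines: start from 0 in R^d and use a fixed learning rate η = 1/√m, where m is the number of online rounds. The squared loss below is an illustrative stand-in for whatever bounded loss the paper actually uses.

```python
import numpy as np

def ogd(data, d):
    """Online gradient descent sketch matching the quoted setup: w_0 = 0 in R^d,
    learning rate eta = 1/sqrt(m). Returns the final iterate and the cumulative loss."""
    m = len(data)
    eta = 1.0 / np.sqrt(m)
    w = np.zeros(d)
    cum_loss = 0.0
    for x, y in data:
        pred = w @ x
        cum_loss += (pred - y) ** 2        # illustrative squared loss
        grad = 2.0 * (pred - y) * x        # its gradient w.r.t. w
        w = w - eta * grad
    return w, cum_loss
```

The cumulative loss, rather than regret, is the efficiency criterion the assessment quotes from the paper, which is why the sketch accumulates it alongside the iterate.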