Split-kl and PAC-Bayes-split-kl Inequalities for Ternary Random Variables
Authors: Yi-Shan Wu, Yevgeny Seldin
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present an extensive set of experiments, where we first compare the kl, Empirical Bernstein, Unexpected Bernstein, and split-kl inequalities applied to (individual) sums of independent random variables in simulated data, and then compare the PAC-Bayes-kl, PAC-Bayes-Unexpected-Bernstein, PAC-Bayes-split-kl, and, in some of the setups, PAC-Bayes-Empirical-Bennett, for several prediction models on several UCI datasets. |
| Researcher Affiliation | Academia | Yi-Shan Wu University of Copenhagen yswu@di.ku.dk Yevgeny Seldin University of Copenhagen seldin@di.ku.dk |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | New code in the supplementary. |
| Open Datasets | Yes | We evaluate the performance of the PAC-Bayes-split-kl inequality in linear classification and in weighted majority vote using several data sets from UCI and LibSVM repositories [Dua and Graff, 2019, Chang and Lin, 2011]. |
| Dataset Splits | Yes | If we split $S$ into two equal parts, $S = S_1 \cup S_2$, we can use $S_1$ to train both a reference prediction rule $h_{S_1}$ and a prior $\pi_{S_1}$, and then learn a PAC-Bayes posterior on $S_2$, and the other way around. By combining the 'forward' and 'backward' approaches we can write $\mathbb{E}_\rho[L(h)] = \frac{1}{2}\mathbb{E}_\rho[\Delta L(h, h_{S_1})] + \frac{1}{2}\mathbb{E}_\rho[\Delta L(h, h_{S_2})] + \frac{1}{2}(L(h_{S_1}) + L(h_{S_2}))$. |
| Hardware Specification | Yes | All experiments were performed on a local server equipped with an Intel Core i9-9900K CPU and an NVIDIA GeForce RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions software like TensorFlow and optimization algorithms like Adam and Rprop, but does not provide specific version numbers for these software dependencies (e.g., 'TensorFlow 2.x' or 'Rprop vX.Y.Z'). |
| Experiment Setup | Yes | In the experiments we take $\delta = 0.05$, and truncate the bounds at 1. For the Unexpected Bernstein bound we take a grid of $\gamma \in \{1/(2b), \dots, 1/(2^k b)\}$ for $k = \lceil \log_2(\sqrt{n/\ln(1/\delta)}/2) \rceil$ and a union bound over the grid, as proposed by Mhammedi et al. [2019]. For the split-kl bound we take $\mu$ to be the middle value, 0, of the ternary random variable. |
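
The setup in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `gamma_grid`, `split_ternary`, and `truncate` are hypothetical, `b` is assumed to be the bound on the range of the random variable (1 for a ternary variable in {-1, 0, 1}), rounding `k` up to an integer is an assumption, and splitting `delta` evenly over the grid is one common way to realize a union bound.

```python
import math


def gamma_grid(n, delta=0.05, b=1.0):
    """Geometric grid 1/(2b), ..., 1/(2^k b) for the Unexpected Bernstein bound.

    Assumption: k is rounded up to the nearest integer and kept at least 1,
    so that the grid is non-empty for very small n.
    """
    k = math.ceil(math.log2(math.sqrt(n / math.log(1.0 / delta)) / 2.0))
    k = max(k, 1)
    grid = [1.0 / (2 ** j * b) for j in range(1, k + 1)]
    # Union bound over the grid: each gamma receives confidence budget delta / k.
    return grid, delta / k


def split_ternary(z, mu=0.0):
    """Split z around mu into non-negative parts, so that z = mu + z_plus - z_minus."""
    z_plus = max(0.0, z - mu)
    z_minus = max(0.0, mu - z)
    return z_plus, z_minus


def truncate(bound):
    """Truncate a bound at 1, as done in the experiments."""
    return min(bound, 1.0)


grid, delta_per_gamma = gamma_grid(n=1000, delta=0.05, b=1.0)
parts = [split_ternary(z) for z in (-1.0, 0.0, 1.0)]
```

With `mu = 0`, each ternary value in {-1, 0, 1} splits into two binary-valued parts, which is the structure the split-kl bound relies on.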