PAC-Bayes-Chernoff bounds for unbounded losses
Authors: Ioar Casado Telletxea, Luis Antonio Ortega Andrés, Aritz Pérez, Andrés Masegosa
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figure 1: Models with very different CGFs coexist within the same model class. On the left, we display several metrics for Inception V3 models trained on CIFAR10 without regularization (Standard) and with L2 regularization (L2). Random refers to a model learned over randomly reshuffled labels and Zero refers to a model where all the weights are equal to zero. For each model, the metrics include train and test accuracy, test log-loss, the ℓ2-norm of the parameters of the model, the variance of the log-loss function, denoted Vν(ℓ(x, θ)), and the expected norm of the input gradients, denoted Eν‖∇xℓ(x, θ)‖₂². On the right, we display the estimated CGFs of each model, following Masegosa and Ortega (2023). Note how models with smaller variance Vν(ℓ(x, θ)), ℓ2-norm, or input-gradient norm Eν‖∇xℓ(x, θ)‖₂² have smaller CGFs. Bounds derived from Theorem 7 naturally exploit these differences. Experimental details in Appendix C. |
| Researcher Affiliation | Academia | Ioar Casado, Machine Learning Group, Basque Center for Applied Mathematics (BCAM), icasado@bcamath.org; Luis A. Ortega, Machine Learning Group, Computer Science Dept., EPS, Universidad Autónoma de Madrid, luis.ortega@uam.es; Aritz Pérez, Machine Learning Group, Basque Center for Applied Mathematics (BCAM), aperez@bcamath.org; Andrés R. Masegosa, Department of Computer Science, Aalborg University, arma@cs.aau.dk |
| Pseudocode | No | No pseudocode or algorithm blocks are present in the paper. |
| Open Source Code | Yes | We include a Jupyter Notebook in the Supplementary Material with the code for reproducing our experiments and figures. |
| Open Datasets | Yes | We trained the model in the CIFAR10 dataset (Krizhevsky et al., 2009) with the default train/test split... |
| Dataset Splits | No | We trained the model in the CIFAR10 dataset (Krizhevsky et al., 2009) with the default train/test split using SGD with momentum 0.9 and learning rate 0.01 with exponential decay of 0.95. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are provided. |
| Software Dependencies | No | The paper mentions using a 'Jupyter Notebook' and general terms like 'SGD' but does not provide specific version numbers for software dependencies like Python, PyTorch, or other libraries. |
| Experiment Setup | Yes | We trained the model in the CIFAR10 dataset (Krizhevsky et al., 2009) with the default train/test split using SGD with momentum 0.9 and learning rate 0.01 with exponential decay of 0.95. All models are trained for 30.000 iterations of batches of size 200 or until the train loss is under 0.005. These settings are selected to ensure that the random label model converges to an interpolator. For ℓ2 regularization, the multiplicative factor is 0.01. |
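
The training recipe quoted in the Experiment Setup row is concrete enough to sketch. Below is a minimal, hedged PyTorch reconstruction of that configuration: SGD with momentum 0.9, learning rate 0.01 with exponential decay 0.95, batches of 200, up to 30,000 iterations or until the train loss drops below 0.005, and an L2 factor of 0.01 for the regularized variant. The small CNN stands in for the paper's Inception V3 (the exact CIFAR-10 Inception variant is not specified in the quoted text), and the remaining names and the decay granularity are assumptions, not the authors' code.

```python
# Hedged sketch of the quoted training configuration; all names are illustrative.
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

BATCH_SIZE = 200          # "batches of size 200"
MAX_ITERS = 30_000        # "30.000 iterations", read as 30,000
LOSS_THRESHOLD = 0.005    # "or until the train loss is under 0.005"
L2_FACTOR = 0.01          # "the multiplicative factor is 0.01" (L2 variant; 0.0 for Standard)

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=T.ToTensor()
)
loader = torch.utils.data.DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)

model = nn.Sequential(  # stand-in for the paper's Inception V3
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(128, 10),
)

# SGD with momentum 0.9 and lr 0.01; weight_decay implements the L2-regularized model.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9,
                            weight_decay=L2_FACTOR)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)
criterion = nn.CrossEntropyLoss()

it, done = 0, False
while not done:
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        it += 1
        if it >= MAX_ITERS or loss.item() < LOSS_THRESHOLD:
            done = True
            break
    # "exponential decay of 0.95"; whether decay is per epoch or per step is not stated
    scheduler.step()
```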
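
The Figure 1 caption also reports two per-model diagnostics: the variance of the log-loss, Vν(ℓ(x, θ)), and the expected squared input-gradient norm, Eν‖∇xℓ(x, θ)‖₂². Below is a hedged sketch of how such estimates could be computed in PyTorch; the function name, batching, and the choice of evaluation data are assumptions rather than the authors' implementation.

```python
# Hedged sketch of the Figure 1 diagnostics: variance of the per-example log-loss
# and mean squared l2-norm of the loss gradient with respect to the input.
import torch
import torch.nn.functional as F

def log_loss_diagnostics(model, loader, device="cpu"):
    model.eval().to(device)
    losses, grad_sq_norms = [], []
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x.requires_grad_(True)
        # per-example log-loss (negative log-likelihood of the true class)
        loss = F.cross_entropy(model(x), y, reduction="none")
        # gradient of the summed loss w.r.t. x yields per-example input gradients
        grads, = torch.autograd.grad(loss.sum(), x)
        losses.append(loss.detach())
        grad_sq_norms.append(grads.flatten(1).pow(2).sum(dim=1))
    losses = torch.cat(losses)
    grad_sq_norms = torch.cat(grad_sq_norms)
    return losses.var().item(), grad_sq_norms.mean().item()
```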