Learning Stochastic Majority Votes by Minimizing a PAC-Bayes Generalization Bound

Authors: Valentina Zantedeschi, Paul Viallard, Emilie Morvant, Rémi Emonet, Amaury Habrard, Pascal Germain, Benjamin Guedj

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we empirically evaluate StocMV, and we compare its generalization bounds and test errors to those obtained with PAC-Bayesian methods learning majority votes.
Researcher Affiliation | Academia | Valentina Zantedeschi (1,2,3), Paul Viallard (4), Emilie Morvant (4), Rémi Emonet (4), Amaury Habrard (4), Pascal Germain (5), Benjamin Guedj (1,2,3). 1 Inria, Lille Nord Europe research centre, France; 2 The Inria London Programme, France and UK; 3 University College London, Department of Computer Science, Centre for Artificial Intelligence, UK; 4 Univ Lyon, UJM-Saint-Etienne, CNRS, Institut d'Optique Graduate School, Laboratoire Hubert Curien UMR 5516, F-42023, Saint-Etienne, France; 5 Département d'informatique et de génie logiciel, Université Laval, Québec, Canada
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | Code, available at https://github.com/vzantedeschi/StocMV, was implemented in pytorch [Paszke et al., 2019] and all experiments were run on a virtual machine with 8 vCPUs and 128Gb of RAM.
Open Datasets | Yes | We study the performance of our method on the binary classification two-moons dataset, with 2 features, 2 classes and N(0, 0.05) Gaussian noise, for which we draw n points for training, and 1,000 points for testing. [...] We consider several classification datasets from UCI [Dua and Graff, 2017], LIBSVM and Zalando [Xiao et al., 2017], of different number of features and of instances. (A hedged generation sketch for the two-moons data appears after this table.)
Dataset Splits | No | No explicit training/validation/test split sizes (exact percentages or sample counts) are given for the main models. For the two-moons dataset, only 'n points for training, and 1,000 points for testing' is stated. For data-dependent priors, the training data S is split into two subsets, S≤m = {(x_i, y_i)}_{i=1}^{m} and S>m = {(x_i, y_i)}_{i=m+1}^{n}, and the 'patience equal to 25 for early stopping' implies a validation set, but its size and usage are not detailed for the main learning process. (A split sketch appears after this table.)
Hardware Specification | Yes | Code, available at https://github.com/vzantedeschi/StocMV, was implemented in pytorch [Paszke et al., 2019] and all experiments were run on a virtual machine with 8 vCPUs and 128Gb of RAM.
Software Dependencies | No | The paper mentions 'pytorch [Paszke et al., 2019]' but does not give a version number for it or list any other software dependencies, which would be needed for reproducibility.
Experiment Setup | Yes | For this set of experiments, we optimize Seeger's Bound (Equation (1)) by (batch) Gradient Descent, for 1,000 iterations and with learning rate equal to 0.1. [...] We train the models by Stochastic Gradient Descent (SGD) using Adam [Kingma and Ba, 2015] with (0.9, 0.999) running average coefficients, batch size equal to 1024 and learning rate equal to 0.1 with a scheduler reducing this parameter of a factor of 10 with 2 epochs patience. We fix the maximal number of epochs to 100 and patience equal to 25 for early stopping, and for MC we fix T = 10 to increase randomness. (A hedged training-setup sketch appears after this table.)
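
For reference, the two-moons data described in the Open Datasets row can be generated with scikit-learn. This is a hypothetical sketch, not the authors' code: the use of make_moons, the interpretation of the noise level as a standard deviation of 0.05, and the choice of n = 500 training points are assumptions made for illustration.

    from sklearn.datasets import make_moons

    # Hypothetical reconstruction of the two-moons setup: 2 features, 2 classes,
    # Gaussian noise N(0, 0.05) (interpreted here as standard deviation 0.05).
    n = 500  # assumed training-set size; the paper draws n training points without fixing n here
    X_train, y_train = make_moons(n_samples=n, noise=0.05, random_state=0)
    X_test, y_test = make_moons(n_samples=1000, noise=0.05, random_state=1)  # 1,000 test points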
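
The data-dependent prior split mentioned in the Dataset Splits row amounts to cutting the training sample in two. The sketch below is hypothetical and reuses X_train and y_train from the sketch above; the choice m = n // 2 is an assumption, since the report does not state how m is chosen.

    # Hypothetical split of S into S_{<=m} (used to learn a data-dependent prior)
    # and S_{>m} (used to evaluate/optimize the bound); m = n // 2 is an assumed choice.
    m = len(X_train) // 2
    S_le_m = (X_train[:m], y_train[:m])
    S_gt_m = (X_train[m:], y_train[m:])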
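
Finally, the Experiment Setup row translates into a standard PyTorch training loop. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the data, model, and objective are placeholders (the paper minimizes Seeger's bound for a stochastic majority vote with T = 10 Monte Carlo samples, which is not reproduced here), while the optimizer, scheduler, batch size, epoch budget, and early-stopping patience follow the reported values.

    import torch
    from torch import nn
    from torch.utils.data import DataLoader, TensorDataset

    # Placeholder data and model: stand-ins for the datasets and the stochastic
    # majority vote studied in the paper.
    X = torch.randn(5000, 2)
    y = (X[:, 0] * X[:, 1] > 0).float()
    loader = DataLoader(TensorDataset(X, y), batch_size=1024, shuffle=True)  # batch size 1024
    model = nn.Linear(2, 1)
    objective = nn.BCEWithLogitsLoss()  # stand-in for the PAC-Bayes (Seeger) bound

    optimizer = torch.optim.Adam(model.parameters(), lr=0.1, betas=(0.9, 0.999))
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.1, patience=2)

    best, stale = float("inf"), 0
    for epoch in range(100):  # maximal number of epochs
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = objective(model(xb).squeeze(1), yb)
            loss.backward()
            optimizer.step()
        with torch.no_grad():
            epoch_loss = objective(model(X).squeeze(1), y).item()
        scheduler.step(epoch_loss)  # learning rate divided by 10 after 2 epochs without improvement
        if epoch_loss < best:
            best, stale = epoch_loss, 0
        else:
            stale += 1
            if stale >= 25:  # early-stopping patience of 25 epochs
                break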