Revisiting the Effects of Stochasticity for Hamiltonian Samplers

Authors: Giulio Franzese, Dimitrios Milios, Maurizio Filippone, Pietro Michiardi

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our theoretical results are supported by an empirical study on a variety of regression and classification tasks for Bayesian neural networks." "In Section 5, we conduct an extensive experimental campaign that corroborates our theory on the convergence rate of various Hamiltonian-based SDE schemes, by exploring the behavior of step size and mini-batch size for a large number of models and datasets."
Researcher Affiliation | Academia | "Data Science Department, EURECOM, France."
Pseudocode | No | The paper describes its numerical integrators through mathematical equations and prose, but contains no clearly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper states that 'Our implementation is loosely based on the PYSGMCMC framework' and links to the PYSGMCMC GitHub repository, but it does not state that the code for its own methodology or experiments is released or available.
Open Datasets | Yes | "We consider four regression datasets (BOSTON, CONCRETE, ENERGY, YACHT) and two classification datasets (IONOSPHERE, VEHICLE) from the UCI repository, as well as a 1-D synthetic dataset, for which the regression result is shown in Appendix F.1." "Throughout our experimental campaign, we work on a number of datasets from the UCI repository" (https://archive.ics.uci.edu/ml/index.php).
Dataset Splits | No | "For each dataset, we have considered 5 random splits into training and test sets; for the regression datasets, we have adopted the splits in Mukhoti et al. (2018)", which are available online under the Creative Commons licence. The paper explicitly describes training and test sets, but no validation split.
Hardware Specification | Yes | "A full exploration of the methods and integrators for a single split (or seed) of a single dataset required approximately one day of computation on a computer cluster featuring Intel Xeon CPU @ 2.00GHz."
Software Dependencies | No | The paper mentions the PYSGMCMC framework but does not specify any software dependencies with version numbers needed to reproduce the experiments.
Experiment Setup | Yes | "For regression tasks we consider BNNs with 4 layers and 50 nodes per layer with RELU activation." "In all cases we set C = 5." "For the oracles we consider full-batch and η = 0.005 in all cases." "In practice we keep one sample every 500 steps (i.e. thinning), and we discard the first 2000 steps."
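The evaluation protocol quoted above (5 random train/test splits per dataset; thinning every 500 sampler steps after discarding the first 2000) can be sketched in plain Python. This is an illustrative sketch, not the authors' code: the function names and the 10% test fraction are assumptions, and the paper actually reuses the Mukhoti et al. (2018) splits for the regression datasets.

```python
import random

def make_splits(n_points, n_splits=5, test_frac=0.1, seed=0):
    """Hypothetical helper: draw `n_splits` random train/test splits.

    The 10% test fraction is an assumption; the paper adopts the
    splits of Mukhoti et al. (2018) for the regression datasets.
    """
    rng = random.Random(seed)
    splits = []
    for _ in range(n_splits):
        idx = list(range(n_points))
        rng.shuffle(idx)
        n_test = int(n_points * test_frac)
        splits.append((sorted(idx[n_test:]), sorted(idx[:n_test])))
    return splits

def kept_sample_steps(num_steps, burn_in=2000, thin=500):
    """Indices of sampler iterations retained as posterior samples:
    discard the first `burn_in` steps, then keep one every `thin` steps,
    matching the paper's stated thinning protocol."""
    return [t for t in range(num_steps)
            if t >= burn_in and (t - burn_in) % thin == 0]
```

For example, a 4000-step chain under this protocol retains the iterations at steps 2000, 2500, 3000, and 3500.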