Entropy-MCMC: Sampling from Flat Basins with Ease

Authors: Bolian Li, Ruqi Zhang

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirical results demonstrate that our method can successfully sample from flat basins of the posterior, and outperforms all compared baselines on multiple benchmarks including classification, calibration, and out-of-distribution detection." See also Section 5 ("Theoretical Analysis").
Researcher Affiliation | Academia | Bolian Li, Ruqi Zhang; Department of Computer Science, Purdue University, USA; {li4468,ruqiz}@purdue.edu
Pseudocode | Yes | Algorithm 1: Entropy-MCMC. (A hedged code sketch of the sampler follows this table.)
Open Source Code | Yes | "We release the code at https://github.com/lblaoke/EMCMC."
Open Datasets | Yes | "We conduct logistic regression on MNIST (LeCun, 1998)... We conduct classification experiments on CIFAR (Krizhevsky, 2009), corrupted CIFAR (Hendrycks & Dietterich, 2019b) and ImageNet (Deng et al., 2009)... Then we use the uncertainty to detect SVHN samples in a joint testing set combined by CIFAR and SVHN (Netzer et al., 2011)."
Dataset Splits | No | The paper uses standard datasets but does not specify how the data was split into training, validation, and test sets (e.g., percentages, sample counts, or an explicit reference to predefined splits). Phrases like "We train each model on CIFAR" appear without further partitioning details.
Hardware Specification | No | The paper does not report hardware details such as GPU/CPU models, processor types, or cloud computing specifications used to run the experiments.
Software Dependencies | No | The paper does not list software dependencies, such as programming language or library versions (e.g., Python 3.x, PyTorch 1.x), needed to replicate the experiment environment.
Experiment Setup | Yes | "Following Zhang et al. (2020b), we adopt a cyclical step size schedule for all sampling methods. For more implementation details, please refer to Appendix E." Appendix E discusses several important hyperparameters and algorithm settings, including the variance term η, step size schedules, temperature T, normalization layers, and SGD burn-in epochs, with specific values and comparisons (e.g., η = 0.5 and α = 5 × 10^-3 in the synthetic examples; Table 4 for temperature magnitudes, Table 6 for step size schedules, Table 10 for SGD burn-in epochs). (A sketch of the cyclical schedule appears below.)
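
For concreteness, here is a minimal sketch of what Algorithm 1 plausibly looks like as coupled SGLD updates, assuming the joint energy U(θ) + ||θ − θ_a||²/(2η) described in the paper, where θ_a is the auxiliary guiding variable and η is the coupling variance. The function name `emcmc_step`, the callable `grad_U`, and the default values are illustrative placeholders, not the authors' exact implementation.

```python
import torch

def emcmc_step(theta, theta_a, grad_U, step_size=5e-3, eta=0.5, temperature=1.0):
    """One coupled SGLD step on the joint energy U(theta) + ||theta - theta_a||^2 / (2*eta).

    theta:   flattened model parameters
    theta_a: auxiliary guiding variable of the same shape (its marginal
             targets the smoothed, flat-basin-favoring posterior)
    grad_U:  callable returning a stochastic gradient of U at theta
    """
    # Gradients of the joint energy; the quadratic coupling term keeps the
    # two variables close, biasing theta toward flat regions.
    g_theta = grad_U(theta) + (theta - theta_a) / eta
    g_aux = (theta_a - theta) / eta

    # Standard SGLD: a gradient step plus Gaussian noise with variance 2*step*T.
    scale = (2.0 * step_size * temperature) ** 0.5
    theta = theta - step_size * g_theta + scale * torch.randn_like(theta)
    theta_a = theta_a - step_size * g_aux + scale * torch.randn_like(theta_a)
    return theta, theta_a
```

Because the coupling is a simple quadratic, each update costs essentially one stochastic gradient of U, which is the sense in which sampling flat basins comes "with ease".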
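
The experiment setup references the cosine cyclical step size schedule of Zhang et al. (2020b) (cSG-MCMC); a small sketch follows, with `alpha_0`, `num_cycles`, and the 0-indexed iteration counter as illustrative choices rather than values taken from the paper.

```python
import math

def cyclical_step_size(k, total_iters, num_cycles, alpha_0=5e-3):
    """Cosine cyclical schedule from cSG-MCMC (Zhang et al., 2020b):
    alpha_k = (alpha_0 / 2) * [cos(pi * (k mod ceil(K/M)) / ceil(K/M)) + 1],
    so each cycle starts at alpha_0 and decays toward zero."""
    cycle_len = math.ceil(total_iters / num_cycles)
    return (alpha_0 / 2.0) * (math.cos(math.pi * (k % cycle_len) / cycle_len) + 1.0)
```

In a sampler like the one sketched above, this schedule would replace the fixed `step_size`, with samples typically collected near the end of each cycle when the step size is small and the chain explores locally.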