Entropy-MCMC: Sampling from Flat Basins with Ease

Authors: Bolian Li, Ruqi Zhang

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirical results demonstrate that our method can successfully sample from flat basins of the posterior, and outperforms all compared baselines on multiple benchmarks including classification, calibration, and out-of-distribution detection." See also Section 5 ("Theoretical Analysis").
Researcher Affiliation | Academia | Bolian Li, Ruqi Zhang; Department of Computer Science, Purdue University, USA; {li4468,ruqiz}@purdue.edu
Pseudocode | Yes | Algorithm 1: Entropy-MCMC. (A hedged code sketch of the sampler follows this table.)
Open Source Code | Yes | "We release the code at https://github.com/lblaoke/EMCMC."
Open Datasets | Yes | "We conduct logistic regression on MNIST (LeCun, 1998)... We conduct classification experiments on CIFAR (Krizhevsky, 2009), corrupted CIFAR (Hendrycks & Dietterich, 2019b) and ImageNet (Deng et al., 2009)... Then we use the uncertainty to detect SVHN samples in a joint testing set combined by CIFAR and SVHN (Netzer et al., 2011)."
Dataset Splits | No | The paper uses standard datasets but does not specify how the data was split into training, validation, and test sets (e.g., percentages, sample counts, or an explicit reference to predefined splits). Phrases like "We train each model on CIFAR" appear without further partitioning details.
Hardware Specification | No | The paper does not report hardware details such as GPU/CPU models, processor types, or cloud computing specifications used to run the experiments.
Software Dependencies | No | The paper does not list software dependencies, such as programming language or library versions (e.g., Python 3.x, PyTorch 1.x), needed to replicate the experiment environment.
Experiment Setup | Yes | "Following Zhang et al. (2020b), we adopt a cyclical step size schedule for all sampling methods. For more implementation details, please refer to Appendix E." Appendix E discusses several important hyperparameters and algorithm settings, including the variance term η, step size schedules, temperature T, normalization layers, and SGD burn-in epochs, with specific values and comparisons (e.g., η = 0.5 and α = 5 × 10^-3 in the synthetic examples; Table 4 for temperature magnitudes, Table 6 for step size schedules, Table 10 for SGD burn-in epochs). (A sketch of the cyclical schedule appears below.)
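
For concreteness, here is a minimal sketch of what Algorithm 1 plausibly looks like as coupled SGLD updates, assuming the joint energy U(θ) + ||θ − θ_a||²/(2η) described in the paper, where θ_a is the auxiliary guiding variable and η is the coupling variance. The function name `emcmc_step`, the callable `grad_U`, and the default values are illustrative placeholders, not the authors' exact implementation.

```python
import torch

def emcmc_step(theta, theta_a, grad_U, step_size=5e-3, eta=0.5, temperature=1.0):
    """One coupled SGLD step on the joint energy U(theta) + ||theta - theta_a||^2 / (2*eta).

    theta:   flattened model parameters
    theta_a: auxiliary guiding variable of the same shape (its marginal
             targets the smoothed, flat-basin-favoring posterior)
    grad_U:  callable returning a stochastic gradient of U at theta
    """
    # Gradients of the joint energy; the quadratic coupling term keeps the
    # two variables close, biasing theta toward flat regions.
    g_theta = grad_U(theta) + (theta - theta_a) / eta
    g_aux = (theta_a - theta) / eta

    # Standard SGLD: a gradient step plus Gaussian noise with variance 2*step*T.
    scale = (2.0 * step_size * temperature) ** 0.5
    theta = theta - step_size * g_theta + scale * torch.randn_like(theta)
    theta_a = theta_a - step_size * g_aux + scale * torch.randn_like(theta_a)
    return theta, theta_a
```

Because the coupling is a simple quadratic, each update costs essentially one stochastic gradient of U, which is the sense in which sampling flat basins comes "with ease".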
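
The experiment setup references the cosine cyclical step size schedule of Zhang et al. (2020b) (cSG-MCMC); a small sketch follows, with `alpha_0`, `num_cycles`, and the 0-indexed iteration counter as illustrative choices rather than values taken from the paper.

```python
import math

def cyclical_step_size(k, total_iters, num_cycles, alpha_0=5e-3):
    """Cosine cyclical schedule from cSG-MCMC (Zhang et al., 2020b):
    alpha_k = (alpha_0 / 2) * [cos(pi * (k mod ceil(K/M)) / ceil(K/M)) + 1],
    so each cycle starts at alpha_0 and decays toward zero."""
    cycle_len = math.ceil(total_iters / num_cycles)
    return (alpha_0 / 2.0) * (math.cos(math.pi * (k % cycle_len) / cycle_len) + 1.0)
```

In a sampler like the one sketched above, this schedule would replace the fixed `step_size`, with samples typically collected near the end of each cycle when the step size is small and the chain explores locally.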