Non-reversible Parallel Tempering for Deep Posterior Approximation

Authors: Wei Deng, Qian Zhang, Qi Feng, Faming Liang, Guang Lin

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments (Simulations of Multi-Modal Distributions): We first simulate the proposed algorithm on a distribution π(β) ∝ exp(−U(β)), where β = (β₁, β₂) and U(β) = 0.2(β₁² + β₂²) − 2(cos(2πβ₁) + cos(2πβ₂)). The heat map is shown in Figure 3(a) with 25 modes of different volumes. To mimic big data scenarios, we can only access the stochastic gradient ∇Ũ(β) = ∇U(β) + 2N(0, I_{2×2}) and the stochastic energy Ũ(β) = U(β) + 2N(0, I). (A sketch of this energy and its noisy evaluations appears after the table.)
Researcher Affiliation | Collaboration | 1 Purdue University, West Lafayette, IN; 2 Morgan Stanley, New York, NY; 3 University of Michigan, Ann Arbor, MI
Pseudocode | Yes | Algorithm 1: Non-reversible parallel tempering with SGD-based exploration kernels (DEO⋆-SGD). (A simplified sketch of the swap schedule appears after the table.)
Open Source Code | Yes | The code is released at https://github.com/WayneDW/Non-reversible-Parallel-Tempering-for-Deep-Posterior-Approximation for reproduction.
Open Datasets | Yes | We choose ResNet20, ResNet32, and ResNet56 and train the models on CIFAR100.
Dataset Splits | No | The paper mentions training models on CIFAR100 but does not specify any explicit train/validation/test splits, percentages, or methodology for splitting the data.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models or memory specifications.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers.
Experiment Setup | Yes | We first run DEO⋆-SGD (P16) with 16 chains for 20,000 iterations. We fix the lowest learning rate at 0.003 and the highest learning rate at 0.6, and propose to tune the target swap rate S for the acceleration-accuracy trade-off. ... For each model, we first pre-train 10 fixed models via 300 epochs and then run algorithms based on momentum SGD (mSGD) for 500 epochs with 10 parallel chains... We fix the lowest and highest learning rates at 0.005 and 0.02, respectively. (A learning-rate ladder sketch appears after the table.)
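For the multi-modal simulation in the Research Type row, the following is a minimal sketch, assuming NumPy, of the toy energy and the noisy gradient/energy evaluations described above; the function names U, grad_U, noisy_grad_U, and noisy_U are illustrative and not taken from the released code.

```python
import numpy as np

def U(beta):
    """Energy U(beta) = 0.2*(b1^2 + b2^2) - 2*(cos(2*pi*b1) + cos(2*pi*b2))."""
    b1, b2 = beta
    return 0.2 * (b1**2 + b2**2) - 2.0 * (np.cos(2 * np.pi * b1) + np.cos(2 * np.pi * b2))

def grad_U(beta):
    """Exact gradient of U."""
    b1, b2 = beta
    return np.array([0.4 * b1 + 4 * np.pi * np.sin(2 * np.pi * b1),
                     0.4 * b2 + 4 * np.pi * np.sin(2 * np.pi * b2)])

def noisy_grad_U(beta, rng):
    """Stochastic gradient: exact gradient plus 2 * N(0, I_2), as in the row above."""
    return grad_U(beta) + 2.0 * rng.standard_normal(2)

def noisy_U(beta, rng):
    """Stochastic energy: exact energy plus 2 * N(0, 1), as in the row above."""
    return U(beta) + 2.0 * rng.standard_normal()
```

The cosine terms create a grid of local minima whose depths are modulated by the quadratic confinement, consistent with the 25 modes of different volumes mentioned in the row.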
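Algorithm 1 (Pseudocode row) alternates SGD-based exploration with non-reversible, deterministic even-odd (DEO) swap attempts between adjacent chains. The sketch below is a simplified illustration assuming a standard Metropolis-style swap test on noisy energies, with the per-chain learning rate playing the role of a temperature; the helper deo_sgd and this particular swap rule are assumptions, and the paper's generalized DEO⋆ schedule and its handling of noisy energies are not reproduced here. The released repository is the authoritative implementation.

```python
import numpy as np

def deo_sgd(init, lrs, n_iter, noisy_grad_U, noisy_U, rng):
    """Parallel chains with per-chain learning rates (ordered low to high) and DEO swaps."""
    chains = [np.array(init, dtype=float) for _ in lrs]
    for t in range(n_iter):
        # Exploration: one SGD step per chain, driven by its noisy gradient.
        for p, lr in enumerate(lrs):
            chains[p] = chains[p] - lr * noisy_grad_U(chains[p], rng)
        # Communication: deterministic even-odd swap attempts between neighbors
        # (even-indexed pairs on even iterations, odd-indexed pairs otherwise).
        for p in range(t % 2, len(lrs) - 1, 2):
            dE = noisy_U(chains[p], rng) - noisy_U(chains[p + 1], rng)
            dB = 1.0 / lrs[p] - 1.0 / lrs[p + 1]  # 1/lr acts as an inverse "temperature"
            if np.log(rng.uniform()) < dB * dE:   # simplified Metropolis swap test
                chains[p], chains[p + 1] = chains[p + 1], chains[p]
    return chains
```

The deterministic alternation between even and odd pairs is what makes the index process non-reversible and shortens round trips compared with randomly chosen swap pairs.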
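The Experiment Setup row reports only the lowest and highest learning rates for each setting. One natural way to fill in the per-chain values is a geometric ladder between those endpoints; the geometric spacing below is an assumption for illustration, not a detail stated in the row.

```python
import numpy as np

def lr_ladder(lr_low, lr_high, n_chains):
    """Return n_chains learning rates geometrically spaced from lr_low to lr_high."""
    return np.geomspace(lr_low, lr_high, n_chains)

# Simulation setting: 16 chains between 0.003 and 0.6.
print(lr_ladder(0.003, 0.6, 16))
# CIFAR100 setting: 10 parallel chains between 0.005 and 0.02.
print(lr_ladder(0.005, 0.02, 10))

# Combined with the earlier sketches (illustrative only):
# chains = deo_sgd([0.0, 0.0], lr_ladder(0.003, 0.6, 16), 20000,
#                  noisy_grad_U, noisy_U, np.random.default_rng(0))
```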