Parallel Affine Transformation Tuning of Markov Chain Monte Carlo
Authors: Philip Schär, Michael Habeck, Daniel Rudolf
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The results of several numerical experiments with PATT are presented in Section 5. We conclude the paper’s main body with some final remarks in Section 6. In addition, we offer complementary information on ATT and related topics in the supplementary material: Appendices A, B and C provide detailed considerations and guidelines regarding the choices of the adjustment types, transformation parameters and update schedules defined in Sections 2 and 3. These appendices may serve as a cookbook for implementing and/or applying ATT or PATT. In place of a related work section, we give a detailed overview of connections between our method and various others in Appendix D. In Appendix E, we prove that in certain cases a simple adaptive MCMC implementation of ATT is equivalent to other, more traditional adaptive MCMC methods, in that the respective transition kernels coincide. The proof of our theoretical result from Section 4 is provided in Appendix F. In Appendix G we elaborate on the models and hyperparameter choices for the experiments behind the results presented in Section 5, and provide some further results. Appendix H presents a series of ablation studies demonstrating that each non-essential component of PATT can, in principle, substantially improve its performance. Appendix I offers more plots illustrating the main experiments as well as the ablation studies. |
| Researcher Affiliation | Academia | ¹Microscopic Image Analysis Group, Friedrich Schiller University Jena, Jena, Germany; ²Faculty of Computer Science and Mathematics, University of Passau, Passau, Germany. Correspondence to: Daniel Rudolf <daniel.rudolf@uni-passau.de>. |
| Pseudocode | Yes | Algorithm 1: ATT transition; Algorithm 2: PATT. (An illustrative sketch of an ATT-style transition appears after the table.) |
| Open Source Code | Yes | The source code for our numerical experiments is provided as a GitHub repository. |
| Open Datasets | Yes | For our second BLR experiment, again following Nishihara et al. (2014), we used the breast cancer Wisconsin (diagnostic) data set (Street et al., 1995)... In our third BLR experiment, we used the Pima diabetes data (Smith et al., 1988)... In our fourth and final experiment on BLR, we used the red wine quality data set (Cortez et al., 2009)... As data for the model we used a small subset of county-wise accumulations of some recent US census data, which we obtained from Kaggle. (See the data-loading sketch after the table.) |
| Dataset Splits | No | When numerically analyzing the sampling performance of PATT and its competitors, we were more interested in their respective long-term efficiency than in their behavior in the early stages. We therefore used a generous burn-in period, in that we considered only those samples generated in the latter half of iterations for this analysis. (The burn-in slice is sketched after the table.) |
| Hardware Specification | No | In order to ensure that our experiments could be executed unaltered on a regular workstation (for the sake of good reproducibility), we ran them on such a machine ourselves. This led us to choose p := 10 (slightly less than the number of available processor cores on our machine) for each of the aforementioned methods throughout all of the experiments. (A core-count heuristic is sketched after the table.) |
| Software Dependencies | No | Instead we relied on the Python interface PyStan of the software package Stan. (A minimal PyStan example follows the table.) |
| Experiment Setup | Yes | For PATT and the naively parallelized versions of HRUSS, AdaRWM and NUTS that we used to run these three methods, we could freely choose the number p of parallel chains maintained by each method. ... we set a parameter n_its ∈ ℕ. ... for AdaRWM, ... we set β := 0.05. ... we used σ² = 10² ... we imposed the independent exponential prior... with fixed rate r = 0.1. (An AdaRWM proposal sketch follows the table.) |
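
The two algorithms named in the Pseudocode row are defined in the paper itself; for orientation only, here is a minimal, illustrative sketch of what a single affine-transformation-tuned transition can look like, using a random-walk Metropolis base kernel. The function name `att_step`, the fixed step size, and the choice of base kernel are ours, not the authors' Algorithm 1:

```python
import numpy as np

def att_step(x, log_density, mu, L, rng, step=0.5):
    """Hedged sketch of one ATT-style transition: run a random-walk
    Metropolis step on the pullback of the target through the affine
    map y -> mu + L @ y, then map the result back."""
    y = np.linalg.solve(L, x - mu)  # current state in tuned coordinates
    y_prop = y + step * rng.standard_normal(y.shape)
    # The Jacobian of a fixed affine map cancels in the acceptance ratio.
    log_alpha = log_density(mu + L @ y_prop) - log_density(mu + L @ y)
    if np.log(rng.uniform()) < log_alpha:
        y = y_prop
    return mu + L @ y
```

In a PATT-like setup, p such chains would run in parallel and (mu, L) would periodically be re-fit from the pooled sample history, e.g. mu as the sample mean and L as a Cholesky factor of the sample covariance.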
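The breast cancer Wisconsin (diagnostic) data set quoted in the Open Datasets row is widely mirrored; one convenient way to load it (scikit-learn is our choice here, not necessarily the authors') is:

```python
from sklearn.datasets import load_breast_cancer

# 569 samples with 30 features each, plus binary diagnosis labels,
# a standard benchmark for Bayesian logistic regression.
X, y = load_breast_cancer(return_X_y=True)
```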
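The burn-in convention quoted in the Dataset Splits row (keep only the latter half of the iterations) is not a train/test split and reduces to a single array slice:

```python
import numpy as np

samples = np.zeros((1000, 2))               # placeholder chain of 1000 draws
post_burn_in = samples[len(samples) // 2:]  # keep only the latter half
```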
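The quoted choice of p := 10 ("slightly less than the number of available processor cores") suggests a simple heuristic; a hedged version, with the cap of 10 taken from the paper and the two-core headroom being our own guess:

```python
import os

# Leave a couple of cores free for the OS, and never exceed the
# paper's reported p = 10 parallel chains.
p = min(10, max(1, (os.cpu_count() or 1) - 2))
```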
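The Software Dependencies row names PyStan without pinning a version; assuming the PyStan 3 interface (`stan.build` / `sample`), a minimal end-to-end call looks like this (the Stan model and data are placeholders, not one of the paper's models):

```python
import stan  # PyStan 3

model_code = """
data { int<lower=0> N; vector[N] x; }
parameters { real mu; }
model { x ~ normal(mu, 1); }
"""

posterior = stan.build(model_code, data={"N": 3, "x": [0.1, -0.2, 0.3]})
fit = posterior.sample(num_chains=4, num_samples=1000)
```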
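In the classic Roberts–Rosenthal adaptive random-walk Metropolis, β is the weight of a fixed fallback proposal mixed with a covariance-adapted one; assuming that is the role of the quoted β := 0.05, the proposal step would read roughly as follows (a sketch under that assumption, not the paper's implementation):

```python
import numpy as np

def adarwm_proposal(x, emp_cov, rng, beta=0.05):
    """Mixture proposal: with probability 1 - beta use the scaled
    empirical covariance, otherwise a small fixed fallback covariance."""
    d = x.shape[0]
    if rng.uniform() < 1.0 - beta:
        cov = (2.38**2 / d) * emp_cov    # adapted component
    else:
        cov = (0.1**2 / d) * np.eye(d)   # non-adaptive safeguard
    return rng.multivariate_normal(x, cov)
```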