Parallel tempering on optimized paths
Authors: Saifuddin Syed, Vittorio Romaniello, Trevor Campbell, Alexandre Bouchard-Côté
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we study the empirical performance of non-reversible PT based on the spline annealing path family (K ∈ {2, 3, 4, 5, 10}) from Section 4, with knots and schedule optimized using the tuning method from Section 3. We compare this method to two PT methods based on standard linear paths: non-reversible PT with adaptive schedule (NRPT+Linear) (Syed et al., 2019), and reversible PT (Reversible+Linear) (Atchadé et al., 2011). Code for the experiments is available at https://github.com/vittrom/PT-pathoptim. We run the following benchmark problems; see the supplement for details. Gaussian: a synthetic setup... Beta-binomial model... Galaxy data... High dimensional Gaussian... The results of these experiments are shown in Figures 3 and 4. (A hedged sketch of the NRPT baseline on a linear path appears after this table.) |
| Researcher Affiliation | Academia | Department of Statistics, University of British Columbia, Vancouver, Canada. Correspondence to: Saifuddin Syed <saif.syed@stat.ubc.ca>, Vittorio Romaniello <vittorio.romaniello@stat.ubc.ca>. |
| Pseudocode | Yes | Algorithm 1: NRPT; Algorithm 2: Path Opt NRPT |
| Open Source Code | Yes | Code for the experiments is available at https://github.com/vittrom/PT-pathoptim. |
| Open Datasets | Yes | Galaxy data: A Bayesian Gaussian mixture model applied to the galaxy dataset of (Roeder, 1990). |
| Dataset Splits | No | The paper describes the datasets used for experiments but does not specify train/validation/test splits; the problems are sampling-based rather than conventional supervised learning tasks. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for the experiments. |
| Software Dependencies | No | The paper mentions software like Adagrad but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For this example we used N = 50 parallel chains and fixed the computational budget to 45000 samples. For Algorithm 2, the computational budget was divided equally over 150 scans, meaning 300 samples were used for every gradient update. The gradient updates were performed using Adagrad (Duchi et al., 2011) with learning rate equal to 0.2. In this experiment we used N = 35 chains and fixed the computational budget to 50000 samples, divided into 500 scans using 100 samples each. We optimized the path using Adagrad with a learning rate of 0.3. The number of chains N is set to increase with dimension at the rate N = 15 d. We fixed the number of spline knots K to 4 and set the computational budget to 50000 samples divided into 500 scans with 100 samples per gradient update. The gradient updates were performed using Adagrad with learning rate equal to 0.2. For all the experiments we performed one local exploration step before each communication step. (A hedged sketch of the reported budget and Adagrad update loop appears after this table.) |
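
The excerpt above compares spline-path PT against non-reversible PT on a standard linear path. The following is a minimal sketch of that NRPT+Linear baseline (deterministic even-odd swaps in the style of Syed et al., 2019) on a linear annealing path; the reference/target densities, proposal scale, and chain count are illustrative assumptions, not the paper's benchmarks, and the optimized spline path is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_pi0(x):             # reference: standard normal (illustrative)
    return -0.5 * x ** 2

def log_pi1(x):             # target: offset, narrower normal (illustrative)
    return -0.5 * ((x - 3.0) / 0.5) ** 2

def log_anneal(x, b):       # linear path: (1 - b) * log pi0 + b * log pi1
    return (1.0 - b) * log_pi0(x) + b * log_pi1(x)

N = 8                                   # number of chains (illustrative)
beta = np.linspace(0.0, 1.0, N)         # uniform schedule on the linear path
x = rng.normal(size=N)                  # one state per chain

for scan in range(2000):
    # one local exploration step per chain (random-walk Metropolis),
    # matching "one local exploration step before each communication step"
    prop = x + 0.5 * rng.normal(size=N)
    accept = np.log(rng.uniform(size=N)) < log_anneal(prop, beta) - log_anneal(x, beta)
    x = np.where(accept, prop, x)

    # deterministic even-odd (DEO) communication step: non-reversible swaps
    start = scan % 2                    # even pairs on even scans, odd pairs otherwise
    for i in range(start, N - 1, 2):
        log_swap = (log_anneal(x[i + 1], beta[i]) + log_anneal(x[i], beta[i + 1])
                    - log_anneal(x[i], beta[i]) - log_anneal(x[i + 1], beta[i + 1]))
        if np.log(rng.uniform()) < log_swap:
            x[i], x[i + 1] = x[i + 1], x[i]

# x[-1] now holds (approximate) draws from the target chain
```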
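The experiment-setup excerpt fixes a total sample budget, splits it into scans, and runs one Adagrad update per scan. The sketch below only illustrates that outer budget/update structure for the Gaussian example (N = 50 chains, 45,000 samples over 150 scans of 300 samples, learning rate 0.2). The objective `schedule_surrogate` is a placeholder of our own: the paper optimizes a round-trip/communication objective over spline knots and schedule, which is not reimplemented here.

```python
import numpy as np

N_CHAINS = 50                          # parallel chains (Gaussian example)
BUDGET = 45_000                        # total samples
N_SCANS = 150                          # one gradient update per scan
SAMPLES_PER_SCAN = BUDGET // N_SCANS   # 300 samples per update
LEARNING_RATE = 0.2                    # Adagrad learning rate reported above

def schedule_surrogate(knots, samples):
    """Placeholder objective: penalize unevenly spaced knots.

    In the paper this would be an estimate of the communication cost along
    the annealing path, computed from the samples of the current scan.
    """
    return np.var(np.diff(np.sort(knots)))

def finite_diff_grad(f, x, samples, eps=1e-4):
    """Finite-difference gradient of f at x (for illustration only)."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        xp, xm = x.copy(), x.copy()
        xp[i] += eps
        xm[i] -= eps
        g[i] = (f(xp, samples) - f(xm, samples)) / (2 * eps)
    return g

# Interior knots of the schedule, initialized uniformly in (0, 1).
knots = np.linspace(0.0, 1.0, N_CHAINS)[1:-1].copy()
accum = np.zeros_like(knots)           # Adagrad accumulator

rng = np.random.default_rng(0)
for scan in range(N_SCANS):
    # Stand-in for running PT for SAMPLES_PER_SCAN samples on the current path.
    samples = rng.normal(size=(SAMPLES_PER_SCAN, 1))

    grad = finite_diff_grad(schedule_surrogate, knots, samples)
    accum += grad ** 2
    knots -= LEARNING_RATE * grad / (np.sqrt(accum) + 1e-8)   # Adagrad step
    knots = np.clip(np.sort(knots), 1e-6, 1 - 1e-6)           # keep a valid schedule
```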