Probabilistic Path Hamiltonian Monte Carlo

Authors: Vu Dinh, Arman Bilge, Cheng Zhang, Frederick A. Matsen IV

ICML 2017

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we demonstrate the validity and efficiency of our PPHMC method by an application to Bayesian phylogenetic inference. We compared our PPHMC implementations to industry-standard MrBayes 3.2.5, which uses MCMC to sample phylogenetic trees (Ronquist et al., 2012). We first tested our PPHMC method on a simulated data set.
Researcher Affiliation Academia (1) Program in Computational Biology, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; (2) Department of Statistics, University of Washington, Seattle, WA, USA.
Pseudocode Yes Algorithm 1 Leap-prog algorithm with step size ϵ. Algorithm 2 Refractive Leap-prog with surrogate
Open Source Code Yes We validate the algorithm through two independent implementations in open-source software: 1. a Scala version available at https://github.com/armanbilge/phyloHMC that uses the Phylogenetic Likelihood Library (Flouri et al., 2015), and 2. a Python version available at https://github.com/zcrabbit/PhyloInfer that uses the ETE toolkit (Huerta-Cepas et al., 2016) and Biopython (Cock et al., 2009).
Open Datasets Yes As a proof of concept, we first tested our PPHMC method on a simulated data set. We used a random unrooted tree with N = 50 leaves sampled from the aforementioned prior. 1000 nucleotide observations for each leaf were then generated by simulating the continuous-time Markov model along the tree. We also analyzed an empirical data set labeled DS4 by Whidden and Matsen (2015) that has become a standard benchmark for MCMC algorithms for Bayesian phylogenetics since Lakner et al. (2008).
Dataset Splits No The paper mentions burn-in periods for MCMC runs ('burn-in period of the first 25% iterations', 'burn-in of 25%'), but does not specify training, validation, or test dataset splits in terms of percentages or sample counts for model development or evaluation.
Hardware Specification No The paper does not specify any particular hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies Yes We compared our PPHMC implementations to industry-standard MrBayes 3.2.5, which uses MCMC to sample phylogenetic trees (Ronquist et al., 2012). 1. a Scala version available at https://github.com/armanbilge/phyloHMC that uses the Phylogenetic Likelihood Library (Flouri et al., 2015), and 2. a Python version available at https://github.com/zcrabbit/PhyloInfer that uses the ETE toolkit (Huerta-Cepas et al., 2016) and Biopython (Cock et al., 2009).
Experiment Setup Yes For PPHMC, we set the step size ϵ = 0.0015 and smoothing threshold δ = 0.003 to give an overall acceptance rate of about α = 0.68 and set the number of leap-prog steps T = 200.
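
To make the role of the reported hyperparameters concrete, the following is a minimal sketch of a standard leapfrog HMC update using the step size ϵ = 0.0015, T = 200 leapfrog steps, and a 25% burn-in. It is not the paper's PPHMC: it omits the tree-topology moves, the refraction step, and the surrogate smoothing controlled by δ. The toy target (a two-dimensional standard normal) and all function names (leapfrog, hmc_step, potential, grad_potential) are assumptions made for illustration only. On this toy target nearly every proposal is accepted; the acceptance rate of about 0.68 quoted above is specific to the paper's phylogenetic posterior.

```python
import math
import random


def leapfrog(q, p, grad_u, eps, n_steps):
    """Standard leapfrog integrator: half momentum step, alternating
    full position/momentum steps, and a final half momentum step."""
    q, p = list(q), list(p)
    g = grad_u(q)
    p = [pi - 0.5 * eps * gi for pi, gi in zip(p, g)]
    for step in range(n_steps):
        q = [qi + eps * pi for qi, pi in zip(q, p)]
        g = grad_u(q)
        # Full momentum step between position updates, half step at the end.
        scale = eps if step < n_steps - 1 else 0.5 * eps
        p = [pi - scale * gi for pi, gi in zip(p, g)]
    return q, p


def hmc_step(q, u, grad_u, eps, n_steps, rng):
    """One HMC transition with a Metropolis accept/reject correction."""
    p0 = [rng.gauss(0.0, 1.0) for _ in q]
    q1, p1 = leapfrog(q, p0, grad_u, eps, n_steps)
    h0 = u(q) + 0.5 * sum(x * x for x in p0)
    h1 = u(q1) + 0.5 * sum(x * x for x in p1)
    accept = rng.random() < math.exp(min(0.0, h0 - h1))
    return (q1, True) if accept else (list(q), False)


# Hypothetical toy target: standard normal, U(q) = |q|^2 / 2.
def potential(q):
    return 0.5 * sum(x * x for x in q)


def grad_potential(q):
    return list(q)


EPS = 0.0015      # step size reported in the experiment setup
T = 200           # number of leapfrog steps reported in the experiment setup
N_ITER = 400      # illustrative chain length, not from the paper

rng = random.Random(0)
q = [1.0, -0.5]
samples, accepted = [], 0
for _ in range(N_ITER):
    q, ok = hmc_step(q, potential, grad_potential, EPS, T, rng)
    samples.append(q)
    accepted += ok

kept = samples[len(samples) // 4:]   # discard the first 25% as burn-in
acceptance_rate = accepted / len(samples)
```

In practice ϵ and T are tuned jointly: a larger step size shortens run time but increases the integration error and so lowers the acceptance rate.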