Improved Variational Bayesian Phylogenetic Inference with Normalizing Flows

Authors: Cheng Zhang

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show that VBPI-NF significantly improves upon the vanilla VBPI on a benchmark of challenging real data Bayesian phylogenetic inference problems. ... We performed experiments on 8 real datasets that are commonly used to benchmark Bayesian phylogenetic inference methods (Hedges et al., 1990; Garey et al., 1996; Yang and Yoder, 2003; Henk et al., 2003; Lakner et al., 2008; Zhang and Blackwell, 2001; Yoder and Yang, 2004; Rossman et al., 2001; Höhna and Drummond, 2012; Larget, 2013; Whidden and Matsen IV, 2015). ... Table 1 shows the estimates of the lower bounds (K = 1, 10) and the marginal likelihood from different variational approaches on the 8 benchmark datasets."
Researcher Affiliation | Academia | "Cheng Zhang, School of Mathematical Sciences and Center for Statistical Science, Peking University, Beijing, China. chengzhang@math.pku.edu.cn"
Pseudocode | Yes | "See algorithm 1 in the supplement for more details."
Open Source Code | Yes | "The code is available at https://github.com/zcrabbit/vbpi-nf."
Open Datasets | Yes | "We performed experiments on 8 real datasets that are commonly used to benchmark Bayesian phylogenetic inference methods (Hedges et al., 1990; Garey et al., 1996; Yang and Yoder, 2003; Henk et al., 2003; Lakner et al., 2008; Zhang and Blackwell, 2001; Yoder and Yang, 2004; Rossman et al., 2001; Höhna and Drummond, 2012; Larget, 2013; Whidden and Matsen IV, 2015). These datasets, which we will call DS1-8, consist of sequences from 27 to 64 eukaryote species with 378 to 2520 site observations (see Table 1 and Lakner et al. (2008))."
Dataset Splits | No | The paper describes optimizing a multi-sample lower bound and evaluating on benchmark datasets, but it does not specify explicit train/validation/test splits of the input sequence data, so the data partitioning cannot be reproduced.
Hardware Specification | No | "The author is grateful for the computational resources provided by the High-performance Computing Platform of Peking University." This statement does not provide specific hardware details such as GPU/CPU models or memory.
Software Dependencies | No | "All models were implemented in Pytorch (Paszke et al., 2019) with the Adam optimizer (Kingma and Ba, 2015)." Specific version numbers for PyTorch and other libraries are not provided.
Experiment Setup | Yes | "We set K = 10 for the multi-sample lower bound, with a schedule λn = min(1, 0.001 + n/100000), going from 0.001 to 1 after 100000 iterations. We evaluate the performance of our permutation equivariant normalizing flows with varying numbers of layers. All models were implemented in Pytorch (Paszke et al., 2019) with the Adam optimizer (Kingma and Ba, 2015)... Results were collected after 400,000 parameter updates."
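The setup quoted above combines two standard ingredients: a linear annealing schedule for the objective weight, and a K-sample (importance-weighted) lower bound. A minimal stdlib-only Python sketch of both is below; the function names are my own for illustration and are not taken from the paper's released code, which uses PyTorch.

```python
import math

def annealing_weight(n, init=0.001, horizon=100_000):
    # Schedule quoted from the paper: lambda_n = min(1, 0.001 + n/100000),
    # rising linearly from 0.001 to 1 over the first 100,000 iterations.
    return min(1.0, init + n / horizon)

def multi_sample_lower_bound(log_weights):
    # Generic K-sample lower bound estimate, log((1/K) * sum_i exp(log w_i)),
    # computed stably via the log-sum-exp trick. The paper sets K = 10.
    k = len(log_weights)
    m = max(log_weights)
    return m + math.log(sum(math.exp(lw - m) for lw in log_weights)) - math.log(k)
```

For example, `annealing_weight(50_000)` is 0.501, and with all K log-weights equal the bound reduces to that common value, as expected for an exact approximation.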