Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders
Authors: Tianyu Xie, David Harry Tyensoung Richman, Jiansi Gao, Frederick A. Matsen IV, Cheng Zhang
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate PhyloVAE's robust representation learning capabilities and fast generation of phylogenetic tree topologies. |
| Researcher Affiliation | Academia | Tianyu Xie¹, Harry Richman³, Jiansi Gao³, Frederick A. Matsen IV³,⁴, Cheng Zhang¹,²; ¹School of Mathematical Sciences, Peking University; ²Center for Statistical Science, Peking University; ³Computational Biology Program, Fred Hutchinson Cancer Research Center; ⁴Howard Hughes Medical Institute. EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: A linear-time algorithm for tree topology encoding |
| Open Source Code | Yes | Our code is released at https://github.com/tyuxie/PhyloVAE. |
| Open Datasets | Yes | Following Hillis et al. (2005), we select five genes and the ground truth phylogenetic tree (Figure 9; 44 leaves) from the early placental mammal evolution analysis in Murphy et al. (2001). The sequence alignment under consideration comprises 290 rabies genomes (Viana et al., 2023). Finally, we assess the generative modeling performance of PhyloVAE on eight benchmark sequence sets, DS1-8, which contain biological sequences from 27 to 64 eukaryote species and are commonly considered for benchmarking tree topology density estimation and Bayesian phylogenetic inference tasks in previous works (Zhang & Matsen IV, 2018; 2019; 2024; Zhang, 2020; Mimori & Hamada, 2023; Zhou et al., 2023; Xie & Zhang, 2023; Xie et al., 2024a,b; Molén et al., 2024; Hotti et al., 2024). |
| Dataset Splits | No | For each gene, we simulate the DNA sequences with a fixed length along the ground truth tree using the corresponding evolutionary model, run a MrBayes chain (Ronquist et al., 2012) for one million iterations, and sample every 100 iterations in the last 100,000 iterations to gather the posterior samples, as done in Hillis et al. (2005). These one million iterations are enough for the MrBayes run to converge. These 5,000 tree topologies with uniform weights constitute the training set of PhyloVAE. ... (i) for each sequence set, there are 10 replicate training sets of tree topologies which are gathered from 10 independent MrBayes runs until the runs have ASDSF (the standard convergence criterion used in MrBayes) less than 0.01 or a maximum of 100 million iterations (tree topologies are sampled every 100 iterations with the first 25% of iterations discarded); (ii) for each sequence set, the ground truth of tree topologies is gathered from 10 single-chain MrBayes runs of one billion iterations each (tree topologies are sampled every 1000 iterations with the first 25% of iterations discarded). |
| Hardware Specification | Yes | The experiments are run on a single 2.4 GHz CPU. ... The experiments are run on a single NVIDIA RTX 2080Ti GPU. |
| Software Dependencies | No | For all experiments, PhyloVAE is implemented in PyTorch (Paszke et al., 2019). The optimizer is Adam (Kingma & Ba, 2015) with parameters (β1, β2) = (0.9, 0.999) and weight_decay = 0.0. |
| Experiment Setup | Yes | The optimizer is Adam (Kingma & Ba, 2015) with parameters (β1, β2) = (0.9, 0.999) and weight_decay = 0.0. The results are collected after 200,000 iterations with batch size B = 10. ... The dimension of the latent space is set to d = 2. The generative model is a three-layer MLP with 512 hidden units and a ResNet architecture. For the inference model, the number of message passing rounds is L = 2, and both MLPµ and MLPσ are composed of a two-layer MLP with 100 hidden units. The number of particles in the multi-sample lower bound (3) is K = 32. The learning rate is set to 0.0003 at the beginning and anneals according to a cosine schedule. |
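The cosine learning-rate schedule quoted in the Experiment Setup row (initial rate 0.0003, annealed over 200,000 iterations) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation; the function name `cosine_lr` and the `lr_min = 0` floor are hypothetical.

```python
import math

def cosine_lr(step, total_steps=200_000, lr_init=3e-4, lr_min=0.0):
    """Cosine-annealed learning rate: starts at lr_init and decays
    smoothly to lr_min over total_steps iterations."""
    t = min(step, total_steps) / total_steps  # training progress in [0, 1]
    return lr_min + 0.5 * (lr_init - lr_min) * (1.0 + math.cos(math.pi * t))
```

For example, `cosine_lr(0)` returns the initial rate 3e-4, the rate at the halfway point (step 100,000) is 1.5e-4, and `cosine_lr(200_000)` reaches the floor. In a PyTorch training loop, the same behavior is typically obtained with `torch.optim.lr_scheduler.CosineAnnealingLR`.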