Finding Most Compatible Phylogenetic Trees over Multi-State Characters
Authors: Tuukka Korhonen, Matti Järvisalo1544-1551
AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically our approach outperforms in scalability the earlier proposed approaches w.r.t. various parameters underlying the problem. We empirically evaluate the performance of our hybrid BT-Max SAT approach. |
| Researcher Affiliation | Academia | Tuukka Korhonen, Matti J arvisalo HIIT, Department of Computer Science, University of Helsinki, Finland |
| Pseudocode | No | The paper describes algorithms and methods but does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our implementation, all test data and detailed results are available via https://github.com/Laakeri/ phylogeny-aaai. |
| Open Datasets | Yes | We generated a comprehensive set of benchmarks using the ms data generator (Hudson 2002) employed in various empirical evaluations of algorithms for maximum compatibility and perfect phylogeny (Gusfield, Frid, and Brown 2007; Gusfield 2010; Stevens and Gusfield 2010; Gysel and Gusfield 2011; Gysel, Gusfield, and Stevens 2013; Coulombe, Stevens, and Gusfield 2015) extended to multistate instances as described in (Gusfield 2010). |
| Dataset Splits | No | The paper describes how instances were generated and parameters varied, but does not specify explicit training, validation, and test dataset splits with percentages or counts. |
| Hardware Specification | Yes | The experiments were run single-threaded on computers with 2.4-GHz Intel Xeon E5-2680-v4 processors with a per-instance 2-hour time limit and 32-GB memory limit. |
| Software Dependencies | Yes | For the PBO approach, we used the code provided by the authors, in particular the PBO-RR variant which includes additional constraints to strengthen the PBO formulation, which we observed to be the best-performing variant in preliminary experiments. For the experiments on PBO, we used Minisat+ (E en and Sorensson 2006) as the PBO solver. For both of the IP-based approaches, we used CPLEX 12.7.1 (IBM ILOG 2017) as the IP solver in the experiments. We implemented our BT-Max SAT approach based on the Triangulator implementation of BT (Korhonen, Berg, and J arvisalo 2019), and used Max HS (Davies and Bacchus 2013) as the Max SAT solver. |
| Experiment Setup | Yes | We generated a comprehensive set of benchmarks using the ms data generator (Hudson 2002) employed in various empirical evaluations of algorithms for maximum compatibility and perfect phylogeny (Gusfield, Frid, and Brown 2007; Gusfield 2010; Stevens and Gusfield 2010; Gysel and Gusfield 2011; Gysel, Gusfield, and Stevens 2013; Coulombe, Stevens, and Gusfield 2015) extended to multistate instances as described in (Gusfield 2010). The generator generates instances based on four parameters: n, the number of taxa; m, the number of characters; k, the number of states for each character; and r, the recombination parameter that controls how far the data is from admitting a perfect phylogeny. Following Gusfield (2010) we set n = m. For the parameter k, the values 4 and 20 are biologically relevant (nucleotide and amino acid) (Wiley and Lieberman 2011). Setting r = 0 guarantees a perfect phylogeny; we let r vary from 0 to 4. |