Finer Metagenomic Reconstruction via Biodiversity Optimization

Authors: Simon Foucart, David Koslicki

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 6 (Numerical Experiments): The purpose here is a proof-of-concept; Figure 1 displays the support size of uniformly randomly (normalized) vectors x versus the percentage of successful recoveries by each algorithm averaged over 200 replicates; Figure 3 demonstrates that when a high percentage of vectors are recovered, the procedure (IRWLP) takes less execution time than Quikr.
Researcher Affiliation | Academia | Simon Foucart, Department of Mathematics, Texas A&M University, College Station, TX 77843, foucart@tamu.edu; David Koslicki, Departments of Computer Science and Engineering, Biology, and the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, dmk333@psu.edu
Pseudocode | No | The paper describes mathematical formulations and optimization problems, such as (MinDiv) and (IRWLP), but does not provide structured pseudocode or algorithm blocks.
Open Source Code | Yes | All numerical experiments can be reproduced via the GitHub repository: https://github.com/dkoslicki/MinimizeBiologicalDiversity
Open Datasets | Yes | We utilized the Greengenes 97% OTU database [7], where [7] is T. Z. DeSantis, P. Hugenholtz, N. Larsen, M. Rojas, E. L. Brodie, K. Keller, T. Huber, D. Dalevi, P. Hu, and G. L. Andersen. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl. Environ. Microbiol., 72(7):5069-5072, 2006.
Dataset Splits | No | The paper describes running 200 replicates or simulations and defines success criteria, but it does not specify explicit training/validation/test dataset splits (e.g., percentages or sample counts).
Hardware Specification | No | The paper does not explicitly mention any specific hardware (e.g., GPU models, CPU types, memory details) used for running the experiments.
Software Dependencies | Yes | We utilized MATLAB's fmincon nonlinear optimizer [19] with the sqp algorithm to solve the optimization (MinDiv), where [19] is The MathWorks, Inc. MATLAB and Statistics Toolbox Release 2019a. Natick, Massachusetts, United States.
Experiment Setup | Yes | In equation (IRWLP), we set q = 0.01 and ε = 10⁻⁵ and terminated the iterative procedure if the change in ℓ1 norm was less than 10⁻³ or if the number of iterations exceeded 25. For the Quikr optimization procedure, we set λ = 10,000. We used k = 3 to form a 64 × 192 k-mer matrix A. We selected k = 4 to form a 256 × 768 k-mer matrix A. We considered the cases when h = 4, 6, and 13 and set q = 0.01 in each of them.
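
The k-mer matrices named in the Experiment Setup row have 4^k rows, one per DNA k-mer (64 for k = 3, 256 for k = 4). The following is a minimal Python sketch of how such a matrix could be assembled, assuming each column holds the normalized k-mer frequencies of one reference sequence. The toy sequences and all function names are hypothetical illustrations; they are not taken from the paper or its repository, and the authors' own experiments were run in MATLAB.

```python
from itertools import product


def kmer_index(k):
    """Map each of the 4**k DNA k-mers to a row index (lexicographic over ACGT)."""
    return {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}


def kmer_frequencies(seq, k, index):
    """Return the length-4**k vector of k-mer frequencies of seq."""
    counts = [0] * len(index)
    total = 0
    for start in range(len(seq) - k + 1):
        row = index.get(seq[start:start + k])
        if row is not None:  # skip windows containing non-ACGT characters
            counts[row] += 1
            total += 1
    # Normalize so the entries sum to 1 (assumption: frequency columns).
    return [c / total for c in counts] if total else [0.0] * len(counts)


def kmer_matrix(seqs, k):
    """Build A with A[i][j] = frequency of k-mer i in sequence j."""
    index = kmer_index(k)
    columns = [kmer_frequencies(s, k, index) for s in seqs]
    return [list(row) for row in zip(*columns)]  # transpose columns into rows


# Hypothetical toy sequences standing in for database entries.
toy_sequences = ["ACGTACGTAC", "GGGTTTAAAC", "ACACACACAC"]
A = kmer_matrix(toy_sequences, 3)
print(len(A), len(A[0]))  # 64 3
```

With k = 4 the same call yields 256 rows; the number of columns is simply the number of sequences supplied.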