Finer Metagenomic Reconstruction via Biodiversity Optimization
Authors: Simon Foucart, David Koslicki
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 6 Numerical Experiments, The purpose here is a proof-of-concept, Figure 1 displays the support size of uniformly randomly (normalized) vectors x versus the percentage of successful recoveries by each algorithm averaged over 200 replicates, Figure 3 demonstrates that when a high percentage of vectors are recovered, the procedure (IRWLP) takes less execution time than Quikr. |
| Researcher Affiliation | Academia | Simon Foucart Department of Mathematics Texas A&M University College Station, TX 77843 foucart@tamu.edu, David Koslicki Departments of Computer Science and Engineering, Biology, and the Huck Institutes of the Life Sciences Pennsylvania State University University Park, PA 16802 dmk333@psu.edu |
| Pseudocode | No | The paper describes mathematical formulations and optimization problems, such as (Min Div) and (IRWLP), but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All numerical experiments can be reproduced via the GitHub repository: https://github.com/dkoslicki/MinimizeBiologicalDiversity |
| Open Datasets | Yes | we utilized the Greengenes 97% OTU database [7], where [7] is T. Z. DeSantis, P. Hugenholtz, N. Larsen, M. Rojas, E. L. Brodie, K. Keller, T. Huber, D. Dalevi, P. Hu, and G. L. Andersen. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl. Environ. Microbiol., 72(7):5069–5072, 2006. |
| Dataset Splits | No | The paper describes running 200 replicates or simulations and defines success criteria, but it does not specify explicit training/validation/test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware (e.g., GPU models, CPU types, memory details) used for running the experiments. |
| Software Dependencies | Yes | We utilized MATLAB's fmincon nonlinear optimizer [19] with the sqp algorithm to solve the optimization (Min Div), where [19] is The MathWorks, Inc. MATLAB and Statistics Toolbox Release 2019a. Natick, Massachusetts, United States. |
| Experiment Setup | Yes | In equation (IRWLP), we set q = 0.01 and ε = 10⁻⁵ and terminated the iterative procedure if the change in ℓ1 norm was less than 10⁻³ or if the number of iterations exceeded 25. For the Quikr optimization procedure, we set λ = 10,000. We used k = 3 to form a 64 × 192 k-mer matrix A. We selected k = 4 to form a 256 × 768 k-mer matrix A. We considered the cases when h = 4, 6, and 13 and set q = 0.01 in each of them. |
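
The row counts quoted in the Experiment Setup entry (64 for k = 3, 256 for k = 4) match the number of possible DNA k-mers, 4^k. The sketch below illustrates that relationship under one assumption that this summary does not confirm: that each column of A is the normalized k-mer frequency profile of one reference sequence. The function names (`kmer_index`, `kmer_frequencies`, `kmer_matrix`) and the toy sequences are hypothetical and are not taken from the paper or its repository.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_index(k):
    """Map each of the 4**k DNA k-mers to a row index (lexicographic order)."""
    return {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}


def kmer_frequencies(sequence, k):
    """Return the normalized k-mer frequency vector (length 4**k) of a sequence.

    Windows containing characters outside ACGT are skipped.
    """
    index = kmer_index(k)
    counts = [0] * len(index)
    for start in range(len(sequence) - k + 1):
        row = index.get(sequence[start:start + k])
        if row is not None:
            counts[row] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def kmer_matrix(sequences, k):
    """Build a matrix with 4**k rows; column j is the profile of sequences[j].

    Assumption: this column layout is illustrative, not the paper's stated
    construction of A.
    """
    columns = [kmer_frequencies(s, k) for s in sequences]
    return [[col[i] for col in columns] for i in range(4 ** k)]


# Toy, made-up sequences purely for illustration.
toy_sequences = ["ACGTACGTAC", "GGGGCCCCAA", "ATATATATAT"]
A = kmer_matrix(toy_sequences, 3)
print(len(A), len(A[0]))  # prints "64 3": 4**3 rows, one column per sequence
```

With k = 4 the same construction yields 256 rows. The column counts reported above (192 and 768) would then correspond to the number of reference sequences used, which this sketch does not attempt to reproduce.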