Metagenomic Binning using Connectivity-constrained Variational Autoencoders
Authors: Andre Lamurias, Alessandro Tibo, Katja Hose, Mads Albertsen, Thomas Dyhre Nielsen
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on both simulated and real-world datasets demonstrate that CCVAE outperforms current state-of-the-art techniques, thus providing a more effective method for metagenomic binning. |
| Researcher Affiliation | Academia | (1) Department of Computer Science, Aalborg University, Aalborg, Denmark; (2) NOVA LINCS, NOVA School of Science and Technology, Lisbon, Portugal; (3) Institute of Logic and Computation, TU Wien, Vienna, Austria; (4) Center for Microbial Communities, Aalborg University, Aalborg, Denmark. |
| Pseudocode | Yes | Algorithm 1: Training CCVAE (see the training-loop sketch after this table). |
| Open Source Code | Yes | The code and data used in the experiments are available at https://github.com/MicrobialDarkMatter/ccvae. |
| Open Datasets | Yes | We perform experiments on one simulated dataset and six Wastewater Treatment Plant (WWTP) datasets (Table 2). These are the same datasets used by Lamurias et al. (2022), where more details about data generation and processing can be found. |
| Dataset Splits | No | The paper discusses training the model and evaluating its performance, but it does not explicitly define or specify a separate validation dataset split. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or memory specifications). |
| Software Dependencies | No | The paper mentions general machine learning components like VAE, gradient descent, and Adam optimizer, but does not specify any software names with version numbers for reproducibility (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The VAE is trained using gradient descent for 1000 epochs with a learning rate of 1e-3. We used mini-batches of 256 edges and sampled 10 negative pairs from a uniform distribution. The loss coefficients were set to α_e = 0.1 and α_scg = 0.3, determined empirically through grid search on the Aale dataset. (These values appear in the sketch below.) |
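The quoted setup (Algorithm 1; 1000 epochs, learning rate 1e-3, mini-batches of 256 edges, 10 uniform negatives, α_e = 0.1, α_scg = 0.3) is enough to sketch what the training loop looks like. The code below is a minimal PyTorch sketch, not the authors' implementation: the network sizes, the hinge-style forms of the edge and single-copy-gene (SCG) penalties, and the names `CCVAE` and `train_ccvae` are our assumptions for illustration; the actual code is in the repository linked above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CCVAE(nn.Module):
    """Minimal VAE encoder/decoder. Layer sizes are illustrative, not the paper's."""
    def __init__(self, in_dim, latent_dim=32, hidden=512):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent_dim)
        self.logvar = nn.Linear(hidden, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, in_dim))

    def encode(self, x):
        h = self.enc(x)
        return self.mu(h), self.logvar(h)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.dec(z), mu, logvar


def train_ccvae(model, features, edges, scg_pairs, epochs=1000, lr=1e-3,
                batch_edges=256, n_neg=10, alpha_e=0.1, alpha_scg=0.3):
    """Sketch of Algorithm 1: VAE loss plus connectivity and SCG penalties.

    features:  (n, in_dim) float tensor of per-contig features
    edges:     (E, 2) long tensor of contig index pairs linked in the assembly graph
    scg_pairs: (S, 2) long tensor of contig pairs sharing a single-copy gene
    """
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    n = features.size(0)
    for _ in range(epochs):
        perm = torch.randperm(edges.size(0))
        for start in range(0, edges.size(0), batch_edges):
            batch = edges[perm[start:start + batch_edges]]   # (B, 2) index pairs
            x = features[batch.flatten()]                    # features of both endpoints
            recon, mu, logvar = model(x)

            # Standard VAE objective: reconstruction + KL divergence.
            recon_loss = F.mse_loss(recon, x)
            kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())

            # Edge penalty: contigs linked in the assembly graph should be close
            # in latent space; uniformly sampled negatives should not. The hinge
            # margin of 1.0 is our placeholder, not the paper's formulation.
            z = mu.view(-1, 2, mu.size(-1))                  # (B, 2, latent)
            pos = (z[:, 0] - z[:, 1]).pow(2).sum(-1).mean()
            neg_idx = torch.randint(0, n, (batch.size(0), n_neg))
            z_neg, _ = model.encode(features[neg_idx.flatten()])
            z_neg = z_neg.view(batch.size(0), n_neg, -1)
            neg = F.relu(1.0 - (z[:, :1] - z_neg).pow(2).sum(-1)).mean()
            edge_loss = pos + neg

            # SCG penalty: contigs sharing a single-copy gene belong to different
            # genomes, so push their embeddings apart (again a hinge placeholder).
            z_a, _ = model.encode(features[scg_pairs[:, 0]])
            z_b, _ = model.encode(features[scg_pairs[:, 1]])
            scg_loss = F.relu(1.0 - (z_a - z_b).pow(2).sum(-1)).mean()

            loss = recon_loss + kl + alpha_e * edge_loss + alpha_scg * scg_loss
            opt.zero_grad()
            loss.backward()
            opt.step()


# Smoke test with random toy data (shapes only, hypothetical feature size):
model = CCVAE(in_dim=100)
train_ccvae(model, torch.randn(500, 100),
            torch.randint(0, 500, (2000, 2)),
            torch.randint(0, 500, (50, 2)), epochs=2)
```

The sketch only fixes the reported hyperparameters; the decisive design choice in the paper is that the loss couples the usual ELBO with assembly-graph connectivity (pulling linked contigs together) and single-copy-gene evidence (pushing incompatible contigs apart), which is what the two extra penalty terms stand in for here.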