DeepCoDA: personalized interpretability for compositional health data
Authors: Thomas Quinn, Dang Nguyen, Santu Rana, Sunil Gupta, Svetha Venkatesh
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our architecture maintains state-of-the-art performance across 25 real-world data sets, all while producing interpretations that are both personalized and fully coherent for compositional data. |
| Researcher Affiliation | Academia | 1Applied Artificial Intelligence Institute (A2I2), Deakin University, Geelong, Australia. |
| Pseudocode | No | The paper describes the network architecture and modules using equations and descriptive text, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Our implementation of DeepCoDA is available from http://github.com/nphdang/DeepCoDA. |
| Open Datasets | Yes | The first contains 13 data sets from (Quinn and Erb, 2020), curated to benchmark compositional data analysis methods for microbiome and similar data [1]. The second contains 12 data sets from (Vangay et al., 2019), curated to benchmark machine learning methods for microbiome data [2]. [1] Available from https://zenodo.org/record/3378099/ [2] Available from https://knights-lab.github.io/MLRepo/ |
| Dataset Splits | Yes | We develop the model in two stages. First, we use a discovery set of 13 data sets to design the architecture and choose its hyper-parameters. Second, we use a verification set of 12 unseen data sets to benchmark the final model. |
| Hardware Specification | No | The paper does not provide specific hardware details like GPU models, CPU types, or memory specifications used for running experiments. |
| Software Dependencies | No | The paper refers to Python notebooks and uses various statistical and machine learning methods, but does not specify version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | Using a discovery set of 13 data sets, we trained models with B = [1, 3, 5, 10] log-bottlenecks and a λs = [0.001, 0.01, 0.1, 1] L1 penalty. Figure 3 shows the standardized performance for all discovery set models for each hyper-parameter combination. Here, we see that B = 5 and λs = 0.01 works well with or without self-explanation. |
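The experiment setup above describes an exhaustive sweep over two hyper-parameters. As a minimal sketch, assuming a plain grid enumeration (the paper does not specify the search procedure beyond the candidate values), the 16 trained configurations can be listed as:

```python
from itertools import product

# Hyper-parameter grid reported in the paper: number of log-bottlenecks B
# and the L1 sparsity penalty lambda_s. On the 13-data-set discovery set,
# the paper selects B = 5 and lambda_s = 0.01.
B_values = [1, 3, 5, 10]
lambda_s_values = [0.001, 0.01, 0.1, 1]

# Enumerate every (B, lambda_s) combination that would be trained;
# a real run would fit and score a model for each pair.
grid = list(product(B_values, lambda_s_values))

print(len(grid))          # 16 combinations
print((5, 0.01) in grid)  # the combination chosen in the paper
```

Each pair would then be evaluated on the discovery data sets, with the verification set reserved for the final benchmark.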