Kernel Methods for Radial Transformed Compositional Data with Many Zeros
Authors: Junyoung Park, Changwon Yoon, Cheolwoo Park, Jeongyoun Ahn
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The applicability of the proposed approach is demonstrated with kernel principal component analysis. We then provide a quantitative assessment of kPCA on new synthetic data and real-world data examples. |
| Researcher Affiliation | Academia | ¹Department of Mathematical Sciences, KAIST, Daejeon, Korea; ²Department of Industrial & Systems Engineering, KAIST, Daejeon, Korea. Correspondence to: Jeongyoun Ahn <jyahn@kaist.ac.kr>, Cheolwoo Park <parkcw2021@kaist.ac.kr>. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper references third-party R packages used (e.g., zCompositions, phyloseq, GUniFrac) but does not provide a direct link or explicit statement about the availability of the source code for the methodology developed in this paper. |
| Open Datasets | Yes | Table 3. Data availability for real data examples (dataset: data source). Hayden et al. (2020): BONUS-CF (WGS) dataset from MicrobiomeDB.org; Gimblet et al. (2017): Experimental cutaneous leishmaniasis dataset from MicrobiomeDB.org; Arumugam et al. (2011): enterotype dataset in R package phyloseq; Carrieri et al. (2021): Supplementary material of the referenced article; Charlson et al. (2010): throat.otu.tab dataset in R package GUniFrac; Schiffer et al. (2019): vaginal.otu.tab dataset in R package GUniFrac. |
| Dataset Splits | No | The paper discusses the use of synthetic and real-world datasets and their sizes, but it does not specify explicit training/validation/test dataset splits with percentages, sample counts, or clear splitting methodology for reproducibility. |
| Hardware Specification | No | The paper mentions computational cost but does not specify any hardware (e.g., CPU, GPU models) used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of 'R package zCompositions' for producing results, and other R packages as data sources, but it does not specify version numbers for these software components or any other software dependencies with version numbers. |
| Experiment Setup | Yes | Figure 4 shows projection plots from kernel PCA using Gaussian kernel with two different values of the parameter γ, for both the radial transformed data and the clr transformed data. Here, γ indicates the parameter for Gaussian kernel. For the polynomial kernel, the degree p = 3 is used. The parameter γ ranges from 1 to 100 for the radial transformed data, and from 0.0001 to 0.01 for the clr transformed data. |
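
The kernel settings in the Experiment Setup row can be illustrated with a minimal sketch. It is not the authors' code (none is released). It assumes that the radial transformation rescales each composition to unit Euclidean norm, that the Gaussian kernel is parameterized as exp(-γ‖u − v‖²), and that the polynomial kernel has an offset of 1. The toy compositions and all function names are hypothetical; only γ = 10 (inside the reported range of 1 to 100 for radial transformed data) and the degree p = 3 come from the setup described above.

```python
import math


def radial_transform(x):
    """Scale a composition to unit Euclidean norm (assumed form of the radial transform)."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


def gaussian_kernel(u, v, gamma):
    """exp(-gamma * ||u - v||^2); this parameterization of gamma is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(u, v, degree=3):
    """(<u, v> + 1)^degree; the offset of 1 is an assumption."""
    return (sum(a * b for a, b in zip(u, v)) + 1.0) ** degree


def gram_matrix(points, kernel):
    """Pairwise kernel evaluations for a list of points."""
    return [[kernel(p, q) for q in points] for p in points]


def center_gram(K):
    """Double-center a symmetric Gram matrix so implicit features have zero mean."""
    n = len(K)
    row_means = [sum(row) / n for row in K]  # equal to column means by symmetry
    total_mean = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy compositions; zeros need no replacement before this transform.
compositions = [
    [0.5, 0.5, 0.0, 0.0],
    [0.7, 0.0, 0.3, 0.0],
    [0.0, 0.0, 0.4, 0.6],
    [0.25, 0.25, 0.25, 0.25],
]
sphere_points = [radial_transform(x) for x in compositions]

K_gauss = center_gram(
    gram_matrix(sphere_points, lambda u, v: gaussian_kernel(u, v, gamma=10.0))
)
K_poly = center_gram(
    gram_matrix(sphere_points, lambda u, v: polynomial_kernel(u, v, degree=3))
)
```

Kernel PCA scores would then come from the leading eigenvectors of a centered Gram matrix, for example via `numpy.linalg.eigh`. The clr comparison would additionally require a zero-replacement step, since the logarithm of zero is undefined; the summary above mentions the R package zCompositions in that context.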