Hyperbolic Procrustes Analysis Using Riemannian Geometry
Authors: Ya-Wei Eileen Lin, Yuval Kluger, Ronen Talmon
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The efficacy of HPA, its theoretical properties, stability and computational efficiency are demonstrated in simulations. In addition, we showcase its performance on three batch correction tasks involving gene expression and mass cytometry data. Specifically, we demonstrate high-quality unsupervised batch effect removal from data acquired at different sites and with different technologies that outperforms recent methods for label-free alignment in hyperbolic spaces. |
| Researcher Affiliation | Academia | Viterbi Faculty of Electrical and Computer Engineering, Technion; Program in Applied Mathematics, Yale University; Interdepartmental Program in Computational Biology and Bioinformatics, Yale University; Department of Pathology, Yale University |
| Pseudocode | Yes | Algorithm 1 Hyperbolic Procrustes analysis |
| Open Source Code | Yes | Our code is available at https://github.com/RonenTalmonLab/HyperbolicProcrustesAnalysis. |
| Open Datasets | Yes | We consider two publicly available datasets: METABRIC [8] and TCGA [26], consisting of samples from five breast cancer subtypes. In the second task, three cohorts of lung cancer (LC) gene expression data [21] are considered... The last task involves CyTOF data [48] |
| Dataset Splits | Yes | We evaluate the quality of the alignment in two aspects using objective measures: (i) k-NN classification, with leave-one-batch-out cross-validation, is utilized for assessing the alignment of the intrinsic structure, and (ii) MMD [19] is used for assessing the distribution alignment quality. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, memory specifications). |
| Software Dependencies | No | The paper mentions software components like 'Python' and the 'POT library [12]', but it does not specify version numbers for any of these components. |
| Experiment Setup | Yes | The synthetic data in $\mathbb{L}^d$ is generated using the sampling scheme described in Section 2 based on [39]. Given an arbitrary point $\mu \in \mathbb{L}^d$ and an arbitrary SPD matrix $\Sigma \in \mathbb{R}^{d \times d}$, we generate a set of $N$ points $Q^{(1)} = \{q_i^{(1)}\}_{i=1}^{N}$ centered at $\mu$ by $\mathbb{L}^d \ni q_i^{(1)} = \mathrm{Exp}_{\mu}(\mathrm{PT}_{\mu_0 \to \mu}(\tilde{v}_i^{(1)}))$, where $\mu_0 = [1, 0]^\top$ is the origin, $v_i^{(1)} = [0, \tilde{v}_i^{(1)}]^\top$, and $\tilde{v}_i^{(1)} \sim \mathcal{N}(0, \Sigma)$. We apply Algorithm 1 to align the three pairs of sets $\{Q^{(1)}, Q^{(2)}\}$, $\{Q^{(1)}, Q^{(3)}\}$, and $\{Q^{(1)}, Q^{(4)}\}$, setting $N = 100$, $\sigma = 1$, and $d \in \{3, 5, 10, 20, \ldots, 40\}$. Each experiment is repeated 10 times with different values of $\mu$, $\Sigma$, $\mu_0$ and $t$. |
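
The sampling scheme quoted in the Experiment Setup row can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' released code. It assumes the standard closed-form exponential map and parallel transport of the Lorentz (hyperboloid) model, and it reads the quoted formula as transporting the lifted tangent vector $v = [0, \tilde{v}]^\top$ from the origin $\mu_0$ to $\mu$. All function names (`lorentz_inner`, `lift_to_hyperboloid`, `parallel_transport_from_origin`, `exp_map`, `sample_wrapped_normal`) are hypothetical.

```python
import numpy as np


def lorentz_inner(x, y):
    """Lorentzian inner product <x, y>_L = -x_0 y_0 + sum_{i>=1} x_i y_i."""
    return -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)


def lift_to_hyperboloid(x):
    """Hypothetical helper: map x in R^d to [sqrt(1 + |x|^2), x] on the hyperboloid."""
    return np.concatenate([[np.sqrt(1.0 + x @ x)], x])


def parallel_transport_from_origin(v, mu):
    """Transport tangent vectors v (rows) from the origin mu_0 = [1, 0, ..., 0] to mu."""
    mu0 = np.zeros_like(mu)
    mu0[0] = 1.0
    alpha = -lorentz_inner(mu0, mu)
    coef = lorentz_inner(mu - alpha * mu0, v) / (alpha + 1.0)
    return v + coef[..., None] * (mu0 + mu)


def exp_map(mu, u):
    """Exponential map at mu applied to tangent vectors u (rows)."""
    n = np.sqrt(np.clip(lorentz_inner(u, u), 0.0, None))
    safe = np.where(n > 1e-12, n, 1.0)
    ratio = np.where(n > 1e-12, np.sinh(safe) / safe, 1.0)  # sinh(n)/n, with limit 1 at n = 0
    return np.cosh(n)[..., None] * mu + ratio[..., None] * u


def sample_wrapped_normal(mu, sigma, n, rng):
    """Draw n points centered at mu: Gaussian in R^d, lift, transport, then Exp."""
    d = sigma.shape[0]
    v_tilde = rng.multivariate_normal(np.zeros(d), sigma, size=n)
    v = np.concatenate([np.zeros((n, 1)), v_tilde], axis=1)  # tangent at the origin
    return exp_map(mu, parallel_transport_from_origin(v, mu))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu = lift_to_hyperboloid(np.array([0.3, -0.2, 0.5]))  # arbitrary center, d = 3
    Q1 = sample_wrapped_normal(mu, np.eye(3), 100, rng)   # N = 100 as in the setup
    # Every sample should satisfy the hyperboloid constraint <q, q>_L = -1.
    print(np.max(np.abs(lorentz_inner(Q1, Q1) + 1.0)))
```

A quick sanity check for any such sampler is that each generated point satisfies $\langle q, q \rangle_{\mathcal{L}} = -1$ with a positive first coordinate, which is what the final line of the sketch reports.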