Persistent Homology for High-dimensional Data Based on Spectral Methods
Authors: Sebastian Damrich, Philipp Berens, Dmitry Kobak
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply these methods to high-dimensional single-cell RNA-sequencing data and show that spectral distances allow robust detection of cell cycle loops. ... 3. a synthetic benchmark, with spectral distances outperforming state-of-the-art alternatives; 4. an application to a range of single-cell RNA-sequencing datasets with ground-truth cycles. |
| Researcher Affiliation | Academia | Hertie Institute for AI in Brain Health, University of Tübingen, Germany; Tübingen AI Center, Germany; IWR, Heidelberg University, Germany |
| Pseudocode | No | The paper describes algorithms and methods in textual form but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/berenslab/eff-ph/tree/neurips2024. |
| Open Datasets | Yes | The Malaria dataset [43]... We obtained the pre-processed data from https://github.com/vhowick/Malaria_Cell_Atlas/raw/v1.0/Expression_Matrices/Smartseq2/SS2_tmmlogcounts.csv.zip. ... The Neural IPC dataset [8]... shared this representation with us for a superset of 297 927 telencephalic excitatory cells and allowed us to share it with this paper (MIT License). ... The Neurosphere dataset [89]... The GO PCA representation was downloaded from https://zenodo.org/record/5519841/files/neurosphere.qs. ... The Hippocampus dataset [89]... The GO PCA representation was downloaded from https://zenodo.org/record/5519841/files/hipp.qs. ... The HeLa2 dataset [72, 89]... The GO PCA representation was downloaded from https://zenodo.org/record/5519841/files/HeLa2.qs. ... The Pancreas dataset [3, 89]... The GO PCA representation was downloaded from https://zenodo.org/record/5519841/files/endo.qs. |
| Dataset Splits | No | The paper describes synthetic data generation and sampling from real-world single-cell datasets, but does not specify explicit training, validation, or test dataset splits. It evaluates performance directly on the sampled data. |
| Hardware Specification | Yes | Our experiments were run on a machine with an Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz with 64 kernels, 377GB memory, and an NVIDIA RTX A6000 GPU. |
| Software Dependencies | Yes | We computed persistent homology using the ripser [4] project's representative-cycles branch at commit 140670f to compute persistent homologies and representative cycles. ... To compute kNN graphs, we used the PyKeOps package [13]. The rest of our implementation is in Python. |
| Experiment Setup | Yes | All methods come with hyperparameters. We report the results for the best hyperparameter setting on each dataset (Appendix K) but found spectral methods to be robust to these choices (Appendix L). ... Appendix I. Details on the distances used in our benchmark. |
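
The table refers to spectral distances computed on kNN graphs without stating a formula. As a rough illustration only, the sketch below computes one common spectral distance, the effective resistance, on a symmetric kNN graph. It is not taken from the authors' repository: the function names `knn_adjacency` and `effective_resistance`, the brute-force NumPy neighbor search (the paper reports using PyKeOps), the unweighted graph, and the toy data are all assumptions made for this example.

```python
import numpy as np


def knn_adjacency(X, k):
    """Symmetric, unweighted kNN adjacency matrix (brute force, illustrative)."""
    n = X.shape[0]
    sq = np.sum(X**2, axis=1)
    # Squared Euclidean distances between all pairs of points.
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    # Keep an edge if either endpoint lists the other as a neighbor.
    return np.maximum(A, A.T)


def effective_resistance(A):
    """Pairwise effective resistance of a connected graph with adjacency A."""
    L = np.diag(A.sum(axis=1)) - A   # combinatorial graph Laplacian
    L_pinv = np.linalg.pinv(L)       # Moore-Penrose pseudoinverse
    diag = np.diag(L_pinv)
    # R_ij = L+_ii + L+_jj - 2 L+_ij
    R = diag[:, None] + diag[None, :] - 2.0 * L_pinv
    np.fill_diagonal(R, 0.0)
    return np.maximum(R, 0.0)        # clip tiny negative rounding errors


# Sanity check on a path graph 0-1-2: unit resistors in series add up.
path = np.array([[0.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0],
                 [0.0, 1.0, 0.0]])
R_path = effective_resistance(path)

# Toy data (assumption): a noisy circle embedded in 50 dimensions.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=200)
X = np.zeros((200, 50))
X[:, 0], X[:, 1] = np.cos(theta), np.sin(theta)
X += 0.05 * rng.standard_normal(X.shape)
R = effective_resistance(knn_adjacency(X, k=15))
```

The resulting matrix `R` could then be passed as a precomputed distance matrix to a persistent homology tool. The formula assumes a connected graph, so a kNN graph with several components would need to be handled separately.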