On UMAP's True Loss Function

Authors: Sebastian Damrich, Fred A. Hamprecht

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We corroborate our theoretical findings on toy and single cell RNA sequencing data.
Researcher Affiliation | Academia | Sebastian Damrich, Fred A. Hamprecht; HCI/IWR at Heidelberg University, 69120 Heidelberg, Germany; {sebastian.damrich, fred.hamprecht}@iwr.uni-heidelberg.de
Pseudocode | Yes | Algorithm 1: UMAP's optimization
Open Source Code | Yes | Our code is publicly available at https://github.com/hci-unihd/UMAPs-true-loss.
Open Datasets | Yes | We illustrate our analysis on gene expression measurements of 86024 cells of C. elegans [16, 14]. We start out with a 100 dimensional PCA of the data obtained from http://cb.csail.mit.edu/cb/densvis/datasets/. We informed the authors of our use of the dataset, which they license under CC BY-NC 2.0.
Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits with percentages, sample counts, or citations to predefined splits.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running the experiments.
Software Dependencies | No | The paper mentions a GitHub repository for their code but does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | We start out with a 100 dimensional PCA of the data and use the cosine metric in high-dimensional space, consider k = 30 neighbors and optimize for 750 epochs, similar to [14].
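
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration, sketched below. This is an illustration only, not the authors' code: the names `ExperimentSetup`, `umap_kwargs`, and `embed` are hypothetical, and mapping the quoted values onto the `umap.UMAP` constructor of the third-party umap-learn package is an assumption, since the summary does not say which implementation or version was used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Values quoted in the Experiment Setup row (class name is hypothetical)."""
    n_pca_dims: int = 100    # input is a 100-dimensional PCA of the data
    metric: str = "cosine"   # metric used in high-dimensional space
    n_neighbors: int = 30    # k = 30 neighbors
    n_epochs: int = 750      # number of optimization epochs


def umap_kwargs(setup):
    """Assumed mapping of the setup onto umap-learn's UMAP constructor arguments."""
    return {
        "n_neighbors": setup.n_neighbors,
        "metric": setup.metric,
        "n_epochs": setup.n_epochs,
    }


def embed(pca_data, setup=ExperimentSetup()):
    """Embed an (n_cells, n_pca_dims) array with umap-learn (assumed backend)."""
    # Imported lazily so the configuration can be inspected without umap-learn.
    import umap

    if pca_data.shape[1] != setup.n_pca_dims:
        raise ValueError(
            f"expected {setup.n_pca_dims} PCA dimensions, got {pca_data.shape[1]}"
        )
    return umap.UMAP(**umap_kwargs(setup)).fit_transform(pca_data)
```

Calling `embed(pca_data)` on the preprocessed array would return a two-dimensional embedding, because umap-learn defaults to two output components. Results may differ from those in the paper, as no software versions or random seeds are listed.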