reproducibilityindex.ai

On UMAP's True Loss Function

Authors: Sebastian Damrich, Fred A. Hamprecht

NeurIPS 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We corroborate our theoretical findings on toy and single cell RNA sequencing data.
Researcher Affiliation	Academia	Sebastian Damrich, Fred A. Hamprecht; HCI/IWR at Heidelberg University, 69120 Heidelberg, Germany; {sebastian.damrich, fred.hamprecht}@iwr.uni-heidelberg.de
Pseudocode	Yes	Algorithm 1: UMAP's optimization
Open Source Code	Yes	Our code is publicly available at https://github.com/hci-unihd/UMAPs-true-loss.
Open Datasets	Yes	We illustrate our analysis on gene expression measurements of 86024 cells of C. elegans [16, 14]. We start out with a 100 dimensional PCA of the data obtained from http://cb.csail.mit.edu/cb/densvis/datasets/. We informed the authors of our use of the dataset, which they license under CC BY-NC 2.0.
Dataset Splits	No	The paper does not explicitly provide training/validation/test dataset splits with percentages, sample counts, or citations to predefined splits.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running the experiments.
Software Dependencies	No	The paper mentions a GitHub repository for their code but does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup	Yes	We start out with a 100 dimensional PCA of the data and use the cosine metric in high-dimensional space, consider k = 30 neighbors and optimize for 750 epochs, similar to [14].
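
The settings quoted in the Experiment Setup row can be collected into a small configuration sketch. The values (100 PCA dimensions, cosine metric, k = 30, 750 epochs) come from the quoted text. Everything else is an assumption: the sketch uses the reference `umap-learn` package rather than the authors' repository, the names `PCA_DIMS`, `UMAP_KWARGS`, and `embed` are hypothetical, and all other UMAP parameters are left at the library defaults, which may differ from what the authors used.

```python
# Hyperparameters quoted in the Experiment Setup row.
PCA_DIMS = 100  # input is a 100-dimensional PCA of the expression data
UMAP_KWARGS = {
    "n_neighbors": 30,   # k = 30 neighbors
    "metric": "cosine",  # cosine metric in high-dimensional space
    "n_epochs": 750,     # optimize for 750 epochs
}


def embed(pca_data, **overrides):
    """Embed pre-computed PCA data with umap-learn (assumed to be installed).

    `pca_data` is expected to be an array of shape (n_cells, PCA_DIMS).
    Parameters not listed in UMAP_KWARGS stay at the umap-learn defaults,
    which is an assumption, not something the quoted text confirms.
    """
    if pca_data.shape[1] != PCA_DIMS:
        raise ValueError(
            f"expected {PCA_DIMS} PCA dimensions, got {pca_data.shape[1]}"
        )
    # Imported lazily so the settings can be inspected without umap-learn.
    import umap

    params = {**UMAP_KWARGS, **overrides}
    return umap.UMAP(**params).fit_transform(pca_data)
```

Calling `embed(pca_data)` on the downloaded PCA matrix would return a two-dimensional embedding under these assumed settings; the authors' repository remains the authoritative source for the exact configuration.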