SpaceMAP: Visualizing High-Dimensional Data by Space Expansion

Authors: Xinrui Zu, Qian Tao

ICML 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated SpaceMAP on a range of synthetic and real datasets with varying manifold properties, and demonstrated its excellent performance in comparison with classical and state-of-the-art DR methods. In particular, the concept of space expansion provides a generic framework for understanding nonlinear DR methods including the popular t-distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP). |
| Researcher Affiliation | Academia | Xinrui Zu¹, Qian Tao¹ ... ¹Department of Imaging Physics, Delft University of Technology. Correspondence to: Xinrui Zu <X.Zu1@tudelft.nl>, Qian Tao <Q.Tao@tudelft.nl>. |
| Pseudocode | Yes | Algorithm 1 describes the maximum likelihood estimation (MLE) of the intrinsic dimensions. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the SpaceMAP methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Experiments were performed on a wide range of datasets, including the standard MNIST (LeCun, 1998), Fashion-MNIST (Xiao et al., 2017), Swiss roll 1 (on the surface we dig a hole to test if visualization methods can preserve the local property on a continuous manifold), Swiss roll 2 (consisting of parallel lines to test the hierarchical manifold assumption), COIL-20 (Nene et al., 1996), RNA-seq (Tasic et al., 2018). |
| Dataset Splits | Yes | For quantitative evaluation, we computed the 20-fold cross-validated KNN classification accuracy, trustworthiness, continuity, Shepard goodness, and normalized stress to evaluate both local and global structure preservation (Espadoto et al., 2021; Nonato & Aupetit, 2019). |
| Hardware Specification | Yes | We implemented all the DR methods on an Ubuntu 20.04 LTS workstation platform with AMD 3900x 4.2GHz 12-core CPU, 64GB DDR4 RAM and NVidia RTX 3090 24GB GPU. |
| Software Dependencies | No | The paper mentions using 'Scikit-learn Barnes-Hut t-SNE implementation' and 'umap-learn implementation' but does not specify their version numbers or any other software dependencies with version details. |
| Experiment Setup | Yes | Two numbers are empirically set: `k_near` = 20 and `k_middle` = 50. ... For t-SNE, we use Scikit-learn Barnes-Hut t-SNE implementation with PCA initialization and the default perplexity is 30. For UMAP, we use the original umap-learn implementation with spectral embedding initialization, the default number of neighbors for each point is `n_neighbors=15`. |
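
The t-SNE baseline settings and two of the evaluation metrics listed above (cross-validated KNN accuracy and trustworthiness) can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' code: the paper released none. The helper names `tsne_embed` and `score_embedding`, the use of the small `load_digits` dataset as a stand-in for MNIST, the KNN neighbor count `k=5`, and the random seeds are all assumptions made for this example. SpaceMAP itself is not implemented here.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE, trustworthiness
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def tsne_embed(X, seed=0):
    """Barnes-Hut t-SNE with PCA initialization and perplexity 30,
    matching the baseline settings quoted above (seed is an assumption)."""
    tsne = TSNE(
        n_components=2,
        method="barnes_hut",
        init="pca",
        perplexity=30,
        random_state=seed,
    )
    return tsne.fit_transform(X)


def score_embedding(X, Y, labels, k=5, n_folds=20, seed=0):
    """Score a 2-D embedding Y of the data X.

    k=5 is an assumption: the summary above does not state the K used
    for the KNN classifier or for trustworthiness.
    """
    # 20-fold cross-validated KNN classification accuracy in the embedding.
    cv = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    knn = KNeighborsClassifier(n_neighbors=k)
    acc = cross_val_score(knn, Y, labels, cv=cv).mean()
    # Trustworthiness compares neighborhoods in X with those in Y.
    trust = trustworthiness(X, Y, n_neighbors=k)
    return {"knn_accuracy": float(acc), "trustworthiness": float(trust)}


# The UMAP baseline would be configured analogously with umap-learn,
# which is not imported here so the sketch depends on scikit-learn only:
#   import umap
#   Y = umap.UMAP(n_neighbors=15, init="spectral").fit_transform(X)

if __name__ == "__main__":
    X, y = load_digits(return_X_y=True)
    X, y = X[:500], y[:500]  # small subset so the demo runs quickly
    Y = tsne_embed(X)
    print(score_embedding(X, Y, y))
```

Continuity, Shepard goodness, and normalized stress are not covered by this sketch, since scikit-learn does not ship them and the summary does not say which implementation the authors used.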