Topological Point Cloud Clustering

Authors: Vincent Peter Grande, Michael T Schaub

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test the performance of TPCC on both synthetic and real-world data and compare it with classical spectral clustering. ... Finally, we verify the accuracy of topological point cloud clustering on a number of synthetic and real-world data and compare it with other approaches on data sets from the literature.
Researcher Affiliation | Academia | Department of Computer Science, RWTH Aachen University, Aachen, Germany. Correspondence to: Vincent P. Grande <grande@cs.rwth-aachen.de>, Michael T. Schaub <schaub@cs.rwth-aachen.de>.
Pseudocode | Yes | A pseudocode version can be found in Algorithm 1. ... Algorithm 1: Topological Point Cloud Clustering (TPCC). (A hedged sketch of this pipeline is given below the table.)
Open Source Code | Yes | Code of our implementation to reproduce the experimental results will be made available in the supplementary material.
Open Datasets | No | The paper uses synthetic data (e.g., "2 spheres, 2 circles", "Toy example", "Sphere in circle") generated by the authors, and real-world data (e.g., the "NALCN channelosome" and the "Energy landscape of cyclo-octane"). However, it does not provide concrete access information (a specific link, DOI, repository name, or formal citation with authors and year) for a publicly available or open dataset used for training or evaluation.
Dataset Splits | No | The paper does not provide dataset split information (exact percentages, sample counts, or citations to predefined splits) for training, validation, and testing, as would be needed to reproduce the data partitioning.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | Software used: We implemented the algorithm in Python. We use the Gudhi library (The GUDHI Project, 2015) for all topology-related computations and operations. For general arithmetic and clustering purposes we use NumPy (Harris et al., 2020), scikit-learn (Pedregosa et al., 2011), and ARPACK (Lehoucq et al., 1998). For subspace clustering, we use DiSC (Zografos et al., 2013). While the software stack is listed, specific version numbers for the libraries are not provided (e.g., "NumPy 1.x.x"); only the publication years of the citations are given. (A version-logging snippet is sketched after the table.)
Experiment Setup | Yes | TPCC needs two main parameters, ε and d. ... We have added i.i.d. Gaussian noise with varying standard deviation specified by the parameter noise on all three coordinates of every point. (Figure 7 caption) ... We sample the point cloud by first taking 5000 points in a grid on each of the tori. We then randomly forget 20% of the points in order to simulate noise. The tori are connected by two straight lines, from which we each sample 300 points uniformly at random. (Appendix B) (A data-generation sketch based on these descriptions follows the table.)
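
For the Pseudocode row: below is a minimal, hedged sketch of a TPCC-style pipeline, assuming the Gudhi, NumPy, SciPy, and scikit-learn libraries named in the paper's software list. It is not the authors' Algorithm 1; the parameter names eps and max_dim stand in for the paper's ε and d, and the final k-means step is a stand-in for the clustering used by the authors.

```python
import numpy as np
import gudhi
from scipy.sparse import lil_matrix
from sklearn.cluster import KMeans


def boundary_matrix(faces, simplices):
    """Signed boundary matrix mapping k-simplices (columns) to their (k-1)-faces (rows)."""
    face_index = {f: i for i, f in enumerate(faces)}
    B = lil_matrix((len(faces), len(simplices)))
    for j, s in enumerate(simplices):
        for i, v in enumerate(s):
            face = tuple(u for u in s if u != v)
            B[face_index[face], j] = (-1) ** i
    return B.tocsc()


def tpcc_sketch(points, eps=0.5, max_dim=1, n_clusters=2, n_eig=3):
    # 1) Vietoris-Rips complex at scale eps, with simplices up to dimension max_dim + 1.
    rips = gudhi.RipsComplex(points=points, max_edge_length=eps)
    st = rips.create_simplex_tree(max_dimension=max_dim + 1)
    simplices = {k: sorted(tuple(s) for s, _ in st.get_skeleton(max_dim + 1)
                           if len(s) == k + 1)
                 for k in range(max_dim + 2)}

    features = np.zeros((len(points), 0))
    for k in range(max_dim + 1):
        Sk = simplices[k]
        if not Sk:
            continue
        # 2) Hodge Laplacian L_k = B_k^T B_k + B_{k+1} B_{k+1}^T on the k-simplices.
        L = np.zeros((len(Sk), len(Sk)))
        if k > 0:
            Bk = boundary_matrix(simplices[k - 1], Sk)
            L += (Bk.T @ Bk).toarray()
        if simplices[k + 1]:
            Bk1 = boundary_matrix(Sk, simplices[k + 1])
            L += (Bk1 @ Bk1.T).toarray()
        # 3) Eigenvectors with the smallest eigenvalues carry the (near-)harmonic,
        #    i.e. topological, signal on the k-simplices.
        _, vecs = np.linalg.eigh(L)
        vecs = vecs[:, :min(n_eig, len(Sk))]
        # 4) Lift simplex features back to points: each point collects the mass of
        #    the eigenvectors on the simplices that contain it.
        point_feat = np.zeros((len(points), vecs.shape[1]))
        for j, s in enumerate(Sk):
            for v in s:
                point_feat[v] += np.abs(vecs[j])
        features = np.hstack([features, point_feat])

    # 5) Cluster the points on their combined topological feature vectors.
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
```

A toy call might look like `labels = tpcc_sketch(np.random.rand(200, 2), eps=0.3, max_dim=1, n_clusters=2)`. For larger complexes, a sparse eigensolver such as ARPACK (listed among the paper's dependencies) would replace the dense `np.linalg.eigh` call.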
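For the Software Dependencies row: since no version numbers are reported, one minimal way to make a rerun reproducible is to log the installed versions of the listed Python packages. The package names below come from the paper's software list (with SciPy added as the usual Python route to ARPACK, and DiSC omitted since it is not a standard PyPI package); the versions the authors actually used remain unknown.

```python
from importlib.metadata import PackageNotFoundError, version

# Record the installed versions of the libraries named in the paper's software list.
for pkg in ["gudhi", "numpy", "scikit-learn", "scipy"]:
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```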
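For the Experiment Setup row: a sketch of the quoted synthetic "two tori" construction, combining the Appendix B sampling description with the Gaussian-noise model from the Figure 7 caption. The point counts, the 20% dropout, the 300 samples per connecting line, and the noise model come from the quoted text; the torus radii, placement, and line endpoints are illustrative assumptions, not the paper's exact values.

```python
import numpy as np

rng = np.random.default_rng(0)


def torus_grid(center, R=2.0, r=0.5, n=5000):
    """Roughly n points on a regular (theta, phi) grid of a torus centred at `center`."""
    m = int(np.sqrt(n))
    theta, phi = np.meshgrid(np.linspace(0, 2 * np.pi, m, endpoint=False),
                             np.linspace(0, 2 * np.pi, m, endpoint=False))
    x = (R + r * np.cos(phi)) * np.cos(theta)
    y = (R + r * np.cos(phi)) * np.sin(theta)
    z = r * np.sin(phi)
    return np.column_stack([x.ravel(), y.ravel(), z.ravel()]) + center


def line(p, q, n=300):
    """n points sampled uniformly at random on the segment from p to q."""
    t = rng.uniform(size=(n, 1))
    return (1 - t) * p + t * q


def keep_fraction(points, frac=0.8):
    """Randomly forget 20% of the points (keep `frac`) to simulate missing data."""
    idx = rng.choice(len(points), size=int(frac * len(points)), replace=False)
    return points[idx]


noise = 0.05  # standard deviation of the Gaussian noise (the paper's `noise` parameter)
torus_a = keep_fraction(torus_grid(center=np.array([-3.0, 0.0, 0.0])))
torus_b = keep_fraction(torus_grid(center=np.array([3.0, 0.0, 0.0])))
bridges = np.vstack([line(np.array([-3.0, 2.5, 0.0]), np.array([3.0, 2.5, 0.0])),
                     line(np.array([-3.0, -2.5, 0.0]), np.array([3.0, -2.5, 0.0]))])
cloud = np.vstack([torus_a, torus_b, bridges])
cloud = cloud + rng.normal(scale=noise, size=cloud.shape)  # i.i.d. noise on all coordinates
```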