A Class of Topological Pseudodistances for Fast Comparison of Persistence Diagrams
Authors: Rolando Kindelan Nuñez, Mircea Petrache, Mauricio Cerda, Nancy Hitschfeld
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also experimentally verify that ETDs outperform PSs in terms of accuracy and outperform Wasserstein and Sliced Wasserstein distances in terms of computational complexity. We test our ETDs for classification applications and experimentally compare to classical methods in terms of accuracy and of computation time. We thus perform a few experiments, comparing ETD against state-of-the-art metrics based on PS, WD and SWD, to find evidence for the above two points in typical ML tasks that use PD information as an input. We summarize in Table 3 a wall-clock comparison between the same metrics as in Table 1, in two applications to PDs coming from the ML pipelines, and we compare accuracy for the tasks in Table 2 and Figure 2 below. |
| Researcher Affiliation | Academia | Rolando Kindelan Nuñez¹, Mircea Petrache², Mauricio Cerda¹*, Nancy Hitschfeld¹*; ¹Universidad de Chile, ²UC Chile |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | Recall that in (Ali et al. 2023) they conduct supervised learning experiments on image classification datasets: the Outex texture database (Ojala et al. 2002), the SHREC14 shape retrieval dataset (Pickup et al. 2016), and the Fashion-MNIST database (Xiao, Rasul, and Vollgraf 2017). |
| Dataset Splits | No | The paper mentions reducing classes and samples for the Outex dataset, but does not provide specific training, validation, or test split percentages or counts for any dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper mentions several libraries and frameworks (Scikit-tda library (Saul and Tralie 2019), Python Optimal Transport (POT) for WD (Flamary et al. 2021), Gudhi Library (Maria et al. 2014), Scikit-learn (Pedregosa et al. 2012)), but does not specify their version numbers. |
| Experiment Setup | Yes | We conduct a Repeated Randomized Search (Bergstra and Bengio 2012) to determine the best k and weight hyperparameters for a k-Nearest Neighbors classifier on each distance matrix. We optimize over choices of k ≤ 9, and optimal values of k are shown in the second column of Table 2. For w we tried two possible choices: w(x_q, x_i) = 1 (uniform) or w(x_q, x_i) = 1/d(x_q, x_i) (distance), and as shown in the last column in Table 2, in all cases, for optimum k the optimum choice of w was the latter. |
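
The kNN hyperparameter search described in the Experiment Setup row can be sketched with scikit-learn. This is a minimal illustration, not the authors' code: a toy Euclidean matrix stands in for a precomputed ETD distance matrix between persistence diagrams, and the sample count, labels, and `n_iter` are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import RandomizedSearchCV

rng = np.random.default_rng(0)

# Toy stand-in for a pairwise distance matrix between 40 persistence diagrams.
points = rng.normal(size=(40, 2))
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
y = (points[:, 0] > 0).astype(int)  # illustrative binary labels

# kNN on a precomputed distance matrix; randomized search over k <= 9 and the
# weight scheme w (uniform vs. inverse-distance), mirroring the paper's setup.
knn = KNeighborsClassifier(metric="precomputed")
param_dist = {
    "n_neighbors": list(range(1, 10)),
    "weights": ["uniform", "distance"],
}
search = RandomizedSearchCV(knn, param_dist, n_iter=10, cv=3, random_state=0)
search.fit(D, y)  # scikit-learn slices D[train][:, train] for pairwise metrics
print(search.best_params_)
```

Because `metric="precomputed"` marks the estimator as pairwise, the cross-validation inside `RandomizedSearchCV` correctly slices both rows and columns of the distance matrix at each fold.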