Robust Persistence Diagrams using Reproducing Kernels
Authors: Siddharth Vishwanath, Kenji Fukumizu, Satoshi Kuriki, Bharath K. Sriperumbudur
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "5 Experiments" and "We illustrate the performance of robust persistence diagrams in machine learning applications through synthetic and real-world experiments." Also, "Table 1: Runtime (in seconds) for computing $\mathrm{Dgm}(f^n_{\rho,\sigma})$ and $\mathrm{Dgm}(f^n_\sigma)$ at each grid resolution." and "Table 2: Rand-index for spectral clustering using distance matrices for $\mathrm{Dgm}(f_{\rho,\sigma})$ and $\mathrm{Img}(d_{X_n}, h)$." |
| Researcher Affiliation | Academia | "Siddharth Vishwanath, The Pennsylvania State University, suv87@psu.edu", "Kenji Fukumizu, The Institute of Statistical Mathematics, fukumizu@ism.ac.jp", "Satoshi Kuriki, The Institute of Statistical Mathematics, kuriki@ism.ac.jp", and "Bharath Sriperumbudur, The Pennsylvania State University, bks18@psu.edu" |
| Pseudocode | No | "The KIRWLS algorithm starts with initial weights $\{w^{(0)}_i\}_{i=1}^{n}$ such that $\sum_{i=1}^{n} w^{(0)}_i = 1$, and generates the iterative sequence of estimators $\{f^{(k)}_{\rho,\sigma}\}_{k \in \mathbb{N}}$ as $f^{(k)}_{\rho,\sigma} = \sum_{i=1}^{n} w^{(k-1)}_i K_\sigma(\cdot, X_i)$; $w^{(k)}_i = \varphi(\lVert \Phi_\sigma(X_i) - f^{(k)}_{\rho,\sigma} \rVert_{\mathcal{H}_\sigma}) \big/ \sum_{j=1}^{n} \varphi(\lVert \Phi_\sigma(X_j) - f^{(k)}_{\rho,\sigma} \rVert_{\mathcal{H}_\sigma})$." (This describes the algorithm in prose and equations, but it is not presented as a clearly labeled "Pseudocode" or "Algorithm" block; see the KIRWLS sketch after the table.) |
| Open Source Code | Yes | "https://github.com/sidv23/robust-PDs" (footnote 1 at the end of Section 5). |
| Open Datasets | Yes | "We perform a variant of the six-class benchmark experiment from [1, Section 6.1]." and "We examine the performance of persistence diagrams in a classification task on [28]." (referencing MPEG7). |
| Dataset Splits | No | The paper describes the datasets used (synthetic data, the benchmark from [1], and MPEG7 from [28]) and mentions training a linear SVM classifier, but does not give train/validation/test split percentages, sample counts, or references to predefined splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper does not specify versions for any software components, programming languages, or libraries used in the experiments (e.g., Python version, specific libraries like PyTorch, or specialized packages like CPLEX with version numbers). |
| Experiment Setup | Yes | "In all the experiments, the kernel bandwidth $\sigma$ is chosen as the median distance of each $x_i \in X_n$ to its $k$th nearest neighbour using the Gaussian kernel with the Hampel loss (similar setting as in [27]); we denote this bandwidth as $\sigma(k)$. Since DTM is closely related to the k-NN density estimator [6], we choose the DTM smoothing parameter as $m(k) = k/n$. Additionally, the KIRWLS algorithm is run until the relative change of empirical risk is $< 10^{-6}$." and "The bandwidth $\sigma(k) > 0$ is chosen for $k = 5$." and "$\mathrm{Dgm}(d_{X_n})$ is transformed to the persistence image $\mathrm{Img}(d_{X_n}, h)$ for $h = 0.1$." and "The smoothing parameters $\sigma(k)$ and $m(k)$ are chosen as earlier for $k = 5$. The persistence diagrams are normalized to have max persistence $\max\{\lvert d - b \rvert : (b, d) \in \mathrm{Dgm}(\phi)\} = 1$, and then vectorized as persistence images, $\mathrm{Img}(f^n_\sigma, h)$, $\mathrm{Img}(f^n_{\rho,\sigma}, h)$, and $\mathrm{Img}(d_{n,m}, h)$, for various bandwidths $h$. A linear SVM classifier is then trained on the resulting persistence images." (A bandwidth-selection sketch follows the table.) |
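
For concreteness, the iteration quoted in the Pseudocode row can be written as a short numerical routine. The sketch below is our own reading of the KIRWLS update, not code from the paper's repository: the names `kirwls`, `gaussian_gram`, `rho`, and `phi` are hypothetical, and `phi` stands for the weight function $\varphi$ derived from the robust loss (the paper uses the Hampel loss).

```python
import numpy as np

def gaussian_gram(X, sigma):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def kirwls(X, sigma, rho, phi, tol=1e-6, max_iter=500):
    """KIRWLS iteration as quoted above (a sketch, not the authors' code).

    rho : the robust loss; phi(r) = rho'(r) / r is its weight function
          (both interfaces are assumptions, not from the paper's repo).
    Returns weights w with f_{rho,sigma} = sum_i w_i K_sigma(., X_i).
    """
    n = len(X)
    K = gaussian_gram(X, sigma)
    w = np.full(n, 1.0 / n)              # initial weights summing to 1
    prev_risk = np.inf
    for _ in range(max_iter):
        # RKHS distance: ||Phi(X_i) - f||^2 = K_ii - 2 (K w)_i + w^T K w
        Kw = K @ w
        dist = np.sqrt(np.maximum(np.diag(K) - 2.0 * Kw + w @ Kw, 0.0))
        risk = np.mean(rho(dist))        # empirical risk under the loss
        # stop when the relative change of empirical risk is < tol (paper: 1e-6)
        if prev_risk < np.inf and abs(prev_risk - risk) / max(prev_risk, 1e-12) < tol:
            break
        prev_risk = risk
        u = phi(dist)                    # reweight: w_i proportional to phi(||Phi(X_i) - f||)
        w = u / u.sum()
    return w

# usage sketch with a Huber-type weight as a stand-in for the Hampel loss:
# rho = lambda r: np.where(r < 1.0, 0.5 * r**2, r - 0.5)
# phi = lambda r: np.where(r < 1.0, 1.0, 1.0 / np.maximum(r, 1e-12))
# w = kirwls(np.random.rand(200, 2), sigma=0.3, rho=rho, phi=phi)
```

Normalizing the weights to sum to 1 at each step keeps each iterate $f^{(k)}_{\rho,\sigma}$ a convex combination of kernel sections, matching the constraint $\sum_i w^{(0)}_i = 1$ in the quoted initialization.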
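
Similarly, the $\sigma(k)$ heuristic from the Experiment Setup row (median distance of each point to its $k$th nearest neighbour, with DTM parameter $m(k) = k/n$) reduces to a few lines. Again a sketch under our assumptions; `knn_bandwidth` is a hypothetical name, not from the paper's code.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_bandwidth(X, k=5):
    """sigma(k): median distance from each x_i in X_n to its k-th
    nearest neighbour (our reading of the paper's heuristic)."""
    dists, _ = cKDTree(X).query(X, k=k + 1)  # neighbour 0 is the point itself
    return np.median(dists[:, k])

# usage sketch:
# X = np.random.rand(500, 2)
# sigma_k = knn_bandwidth(X, k=5)   # kernel bandwidth sigma(k), paper uses k = 5
# m_k = 5 / len(X)                  # DTM smoothing parameter m(k) = k/n
```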