Diffeomorphic interpolation for efficient persistence-based topological optimization

Authors: Mathieu Carrière, Marc Theveneau, Théo Lacombe

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide numerical evidence for the strength of our diffeomorphic interpolations. The first two experiments were run on an 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz, the last one on a 2x Xeon SP Gold 5115 @ 2.40GHz. Our code is publicly available at https://github.com/tlacombe/topt.
Researcher Affiliation | Academia | Mathieu Carrière, DataShape, Centre Inria d'Université Côte d'Azur, Sophia Antipolis, France, mathieu.carriere@inria.fr; Marc Theveneau, Shape Analysis Group, Computer Science Department, McGill University, Montréal, Quebec, Canada, marc.theveneau@mail.mcgill.ca; Théo Lacombe, Laboratoire d'Informatique Gaspard Monge, Univ. Gustave Eiffel, CNRS, LIGM, F-77454 Marne-la-Vallée, France, theo.lacombe@univ-eiffel.fr
Pseudocode | Yes | Algorithm 1: Diffeomorphic gradient descent for topological loss functions with subsampling
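As a rough illustration of the idea behind Algorithm 1 (this is not the authors' implementation; their code lives in the repository cited above), the sketch below interpolates a topological gradient computed on a subsample into a smooth Gaussian-kernel vector field and moves every point of the cloud with it. The function names, the ridge regularizer `reg`, and the default values are assumptions for illustration; the exact interpolation formula and its diffeomorphism guarantees are those of Eq. (4) and Algorithm 1 in the paper.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def diffeo_descent_step(X, sub_idx, sub_grad, sigma=0.3, lam=0.1, reg=1e-6):
    """One descent step: interpolate the topological gradient known only on the
    subsample X[sub_idx] into a smooth vector field on the whole cloud, then
    update every point. `reg` is an illustrative ridge term for numerical stability."""
    Xs = X[sub_idx]                                    # subsample of size s
    K_ss = gaussian_kernel(Xs, Xs, sigma)              # (s, s) kernel on the subsample
    K_ns = gaussian_kernel(X, Xs, sigma)               # (n, s) cross-kernel to all points
    alpha = np.linalg.solve(K_ss + reg * np.eye(len(sub_idx)), sub_grad)
    v = K_ns @ alpha                                   # smooth vector field at all n points
    return X - lam * v                                 # gradient-descent update
```

The point of the scheme is visible here: the persistence diagram and its gradient only ever need to be computed on the s subsampled points, yet all n points move at every iteration.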
Open Source Code | Yes | Our code is publicly available at https://github.com/tlacombe/topt.
Open Datasets | Yes | For this, we consider the vertices of the Stanford Bunny [41], yielding a point cloud $X_0 \in \mathbb{R}^{n \times d}$ with n = 35,947 and d = 3. We consider a topological augmentation loss (see Section 2.1) for two-dimensional topological features, i.e., we aim at increasing the persistence of the bunny's cavity. The size of n makes the computation of Dgm(X_0) intractable (recall that it scales in $O(n^4)$); we thus rely on subsampling with sample size s = 100 and compare the vanilla gradient descent scheme and our scheme described in Algorithm 1. [...] trained a variational autoencoder (VAE) to project a family of datasets of images representing rotating objects, named COIL [32], to two-dimensional latent spaces. [...] Specifically, we processed this single-cell dataset of 1,171 cells with the stratum-adjusted correlation coefficient (SCC) at 500kb resolution and convolution parameter h = 1 on chromosome 10. Then, we ran kernel PCA on the SCC matrix to obtain a preprocessed dataset in $\mathbb{R}^{100}$, on which we applied the same VAE architecture as the one described above for COIL images. Finally, we optimized two losses: the first was the same augmentation loss as for the COIL images, the second was the following registration loss: $L : X \in \mathbb{R}^{n \times d} \mapsto W_2(\mathrm{Dgm}_1(X), D_{\mathrm{target}})$, where $\mathrm{Dgm}_1(X)$ contains the points of the PD of X with distance-to-diagonal at least τ = 1, $D_{\mathrm{target}}$ is a target PD with only one point p = [3.5, 3.5], and $W_2$ is the 2-Wasserstein distance between PDs. We used σ = 0.2 for the augmentation loss and σ = 0.025 for the registration loss (σ is set to a smaller value for the registration loss in order to mitigate the effects of matching instability), λ = 0.1 over 500 epochs, with subsampling of size s = 300 for computational efficiency (as computing VR diagrams without radius thresholding on 1,171 points already takes a few minutes on a laptop CPU, which becomes hardly tractable if done repeatedly as in gradient descent), and a loss increase of 3.0 as a stopping criterion. Qualitative results are displayed in Appendix C, Figure 11, and we also measured quantitative improvement with the correlation scores between latent space angles and repli-scores in Table 1, in which improvements can be observed.
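To make the registration loss above concrete, here is a minimal, non-differentiable sketch of how such a quantity can be evaluated with Gudhi (gudhi.wasserstein requires the POT package; the paper's actual pipeline is differentiated through TensorFlow). The helper name, the persistence-based thresholding convention for τ, and the target value (taken verbatim from the quote) are assumptions for illustration only.

```python
import numpy as np
import gudhi
from gudhi.wasserstein import wasserstein_distance  # requires the POT package

def registration_loss(X, target=np.array([[3.5, 3.5]]), tau=1.0):
    """2-Wasserstein distance between the thresholded 1-dimensional PD of X
    and a one-point target diagram (values taken from the quote above)."""
    st = gudhi.RipsComplex(points=X).create_simplex_tree(max_dimension=2)
    st.compute_persistence()
    dgm1 = np.asarray(st.persistence_intervals_in_dimension(1)).reshape(-1, 2)
    dgm1 = dgm1[np.isfinite(dgm1[:, 1])]          # drop essential (infinite) classes
    # Keep points far from the diagonal; the paper's exact distance convention may differ.
    dgm1 = dgm1[(dgm1[:, 1] - dgm1[:, 0]) >= tau]
    return wasserstein_distance(dgm1, target, order=2.0)
```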
Dataset Splits | No | The paper states: 'compute a validation loss, i.e., sample $X_{t,(1)}, \ldots, X_{t,(K)}$ and estimate $\hat{L}$ by $K^{-1} \sum_{k=1}^{K} \ell(\mathrm{Dgm}(X_{t,(k)}))$'. While it discusses computing a validation loss, it does not provide explicit details on how the datasets (e.g., COIL, scHi-C) were split into train, validation, and test sets with percentages or sample counts. For instance, for COIL/scHi-C, it mentions generating '100 test sets obtained by randomly perturbing the training set', which is not a standard train/validation/test split.
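The validation loss quoted in this entry is a plain Monte-Carlo estimate over random subsamples; a minimal sketch, assuming `topo_loss` stands for $\ell(\mathrm{Dgm}(\cdot))$ and with illustrative defaults for s and K:

```python
import numpy as np

def validation_loss(X, topo_loss, s=100, K=10, rng=None):
    """Estimate the loss by averaging topo_loss over K random subsamples of size s."""
    rng = np.random.default_rng() if rng is None else rng
    values = [topo_loss(X[rng.choice(len(X), size=s, replace=False)])
              for _ in range(K)]
    return float(np.mean(values))
```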
Hardware Specification | Yes | The first two experiments were run on an 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz, the last one on a 2x Xeon SP Gold 5115 @ 2.40GHz.
Software Dependencies | No | PH-related computations rely on the Gudhi library [40] and automatic differentiation relies on TensorFlow [1]. The paper mentions these software packages but does not provide specific version numbers for them.
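Since no versions are pinned, a reproduction has to record them itself; a trivial sketch, relying only on the standard `__version__` attributes both packages expose:

```python
import gudhi
import tensorflow as tf

# Log the library versions actually used, since the paper does not pin them.
print("gudhi", gudhi.__version__)
print("tensorflow", tf.__version__)
```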
Experiment Setup | Yes | We sample uniformly n = 200 points on a unit circle in $\mathbb{R}^2$ with some additional Gaussian noise, and then minimize the simplification loss $L : X \mapsto \sum_{(b,d) \in \mathrm{Dgm}(X)} |d|^2$, which attempts to destroy the underlying topology in X by reducing the death times of the loops (collapsing the points). The respective gradient descents are iterated over a maximum of 250 epochs, possibly interrupted earlier if a loss of 0 is reached (Dgm(X) is empty), with the same learning rate λ = 0.1. The bandwidth of the Gaussian kernel in (4) is set to σ = 0.1. [...] we use learning rate λ = 0.1, Gaussian kernels with bandwidth σ = 0.3, and an increase of at least 3.0 in the topological loss (from one iteration to the next) to stop the algorithm. [...] We used σ = 0.2 for the augmentation loss and σ = 0.025 for the registration loss (σ is set to a smaller value for the registration loss in order to mitigate the effects of matching instability), λ = 0.1 over 500 epochs, with subsampling of size s = 300 for computational efficiency (as computing VR diagrams without radius thresholding on 1,171 points already takes a few minutes on a laptop CPU, which becomes hardly tractable if done repeatedly as in gradient descent), and a loss increase of 3.0 as a stopping criterion.
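For the first experiment in this entry, a minimal sketch of the data generation and of a plain (non-differentiable) evaluation of the simplification loss with Gudhi; the noise scale 0.05 is an assumption (the paper only says 'some additional Gaussian noise'), and the paper's optimization itself goes through TensorFlow's automatic differentiation rather than this direct evaluation:

```python
import numpy as np
import gudhi

# n = 200 points sampled uniformly on the unit circle, plus Gaussian noise
# (the 0.05 noise scale is an assumption; the paper does not state the value).
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 200)
X = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.normal(size=(200, 2))

def simplification_loss(X):
    """Sum of squared death times of the 1-dimensional features (loops),
    i.e. L(X) = sum over (b, d) in Dgm(X) of |d|^2 as quoted above."""
    st = gudhi.RipsComplex(points=X).create_simplex_tree(max_dimension=2)
    st.compute_persistence()
    dgm1 = np.asarray(st.persistence_intervals_in_dimension(1)).reshape(-1, 2)
    deaths = dgm1[np.isfinite(dgm1[:, 1]), 1]     # finite death times only
    return float(np.sum(deaths ** 2))

print(simplification_loss(X))
```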