Differentiable Mapper for Topological Optimization of Data Representation
Authors: Ziyad Oulhaj, Mathieu Carrière, Bertrand Michel
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implement and showcase the efficiency of Mapper filter optimization through Soft Mapper on various data sets... We present applications on 3D shape data in Section 7.1 and on single-cell RNA sequencing data in Section 7.2. |
| Researcher Affiliation | Academia | 1Nantes Université, École Centrale Nantes, Laboratoire de Mathématiques Jean Leray, CNRS UMR 6629, Nantes, France. 2DataShape, Centre Inria d'Université Côte d'Azur, Sophia Antipolis, France. |
| Pseudocode | Yes | Algorithm 1 Soft Mapper Optimization Algorithm. Require: Initial parameter set θ0, number of Monte Carlo random samples M, learning rate sequence (αi)i, random noise sequence (ξi)i, number of epochs N. for 0 ≤ i < N do: for 1 ≤ m ≤ M do: e ∼ sample from Pθi; yi,m ← an element of the sub-differential in θi of Le : θ ↦ L(e, fθ); end for; yi ← (1/M) Σ_{m=1}^{M} yi,m; θi+1 ← θi − αi(yi + ξi); end for; return θN |
| Open Source Code | Yes | We implement and showcase the efficiency of Mapper filter optimization through Soft Mapper on various data sets, with public, open-source code in TensorFlow. Our code is available at (Oulhaj, 2024). Oulhaj, Z. Mapper filter optimization. https://github.com/ZiyadOulhaj/Mapper-Optimization, 2024. |
| Open Datasets | Yes | We now apply Mapper optimization on the human preimplantation dataset of (Petropoulos et al., 2016), which can also be found in the tutorial of the scTDA Python library. The dataset can be accessed in the following link (sct), and it contains the expression levels for p = 26,270 genes for each individual cell. scTDA. https://github.com/CamaraLab/scTDA. Accessed: 2024-01-23. |
| Dataset Splits | No | The paper mentions using a 'randomly sampled subset' for a heuristic and discusses 'training dataset' implicitly, but does not provide specific percentages, counts, or methodology for training/validation/test splits for reproducibility. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'TensorFlow', the 'scTDA Python library', and the 'Seurat package in R' but does not specify their version numbers for reproducibility. |
| Experiment Setup | Yes | The parametric family of functions is linear, i.e., equal to {fθ : x ↦ ⟨x, θ⟩, θ ∈ ℝ³}, and the cover assignment scheme Aδ is the smooth relaxation of the standard case, with δ = 10⁻² · (max_{x∈Xn} fθ(x) − min_{x∈Xn} fθ(x))... The values of r (also called resolution), g (also called gain) and the number of clusters in the KMeans algorithm, for each 3-dimensional shape, are summarized in Appendix G. To find θ, we use the opposite of the L1 total (regular) persistence as a persistence-specific loss ℓ, and we run Algorithm 1 with N = 200 and M = 10, each time taking the diagonal as the initial direction, i.e. θ0 = (1/3, 1/3, 1/3)ᵀ. |
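The optimization loop quoted under "Pseudocode" (Algorithm 1) and the setup row above can be sketched as a plain stochastic sub-gradient descent in NumPy. This is a minimal illustration, not the paper's TensorFlow implementation: the helpers `sample_e` (drawing a cover assignment e ∼ Pθ) and `subgrad` (an element of the sub-differential of L(e, fθ) in θ) are hypothetical placeholders the caller must supply.

```python
import numpy as np

def soft_mapper_optimize(sample_e, subgrad, theta0, M=10, N=200,
                         lr=lambda i: 1e-2, noise_scale=0.0, seed=0):
    """Sketch of Algorithm 1 (Soft Mapper Optimization).

    sample_e(theta, rng): hypothetical sampler for e ~ P_theta.
    subgrad(e, theta):    hypothetical sub-gradient of theta -> L(e, f_theta).
    lr(i):                learning-rate sequence (alpha_i)_i.
    noise_scale:          magnitude of the random-noise sequence (xi_i)_i.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    for i in range(N):
        # Monte Carlo average of M sub-gradient samples: y_i = (1/M) sum_m y_{i,m}
        grads = [subgrad(sample_e(theta, rng), theta) for _ in range(M)]
        y = np.mean(grads, axis=0)
        # Perturbed descent step: theta_{i+1} = theta_i - alpha_i (y_i + xi_i)
        xi = noise_scale * rng.standard_normal(theta.shape)
        theta = theta - lr(i) * (y + xi)
    return theta

# Toy usage with the paper's diagonal initialization theta0 = (1/3, 1/3, 1/3)^T
# and a stand-in quadratic loss in place of the persistence-based loss:
target = np.array([1.0, 0.0, 0.0])
theta = soft_mapper_optimize(
    sample_e=lambda th, rng: None,                 # sampling is a no-op here
    subgrad=lambda e, th: 2.0 * (th - target),     # grad of ||theta - target||^2
    theta0=[1 / 3, 1 / 3, 1 / 3], M=10, N=200, lr=lambda i: 0.1)
```

With the quadratic stand-in loss the iterates contract toward `target` geometrically, so the loop converges well within the paper's N = 200 epochs.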