Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Cover learning for large-scale topology representation

Authors: Luis Scoccola, Uzu Lim, Heather A. Harrington

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide an implementation of ShapeDiscover (Scoccola & Lim, 2025), a cover learning algorithm based on our theory, and showcase it on two sets of experiments: a quantitative one on topological inference, and a qualitative one on large-scale topology visualization. In the first case, ShapeDiscover learns topologically correct simplicial complexes, on synthetic and real data, of smaller size than those obtained with previous topological inference approaches. In the second, we argue that ShapeDiscover represents the large-scale topology of real data better, and with more intuitive parameters, than previous TDA algorithms that fit the cover learning framework.
Researcher Affiliation | Academia | ¹Centre de Recherches Mathématiques et Institut des sciences mathématiques, Laboratoire de combinatoire et d'informatique mathématique de l'Université du Québec à Montréal, Université de Sherbrooke, Canada. ²Queen Mary University of London, United Kingdom. ³Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany; Centre for Systems Biology, Dresden, Germany; Faculty of Mathematics, Technische Universität Dresden, Germany; Mathematical Institute, University of Oxford, United Kingdom. Correspondence to: Luis Scoccola <EMAIL>.
Pseudocode | Yes |

Algorithm 1 (1D Mapper cover learning algorithm)
  Input: data X, function f : X → R, clustering algorithm C_θ, parameter(s) θ for C_θ, cover {I_i}_{i=1}^k of R
  Take the pullback cover {f⁻¹(I_i)}_{i=1}^k of X
  Let U_i := C_θ(f⁻¹(I_i)) for 1 ≤ i ≤ k
  Return the union ⋃_{i=1}^k U_i

Algorithm 2 (Ball Mapper cover learning algorithm)
  Input: data X, ε > 0
  Build an ε-net {y_i}_{i=1}^k of X
  Return the cover {B(y_i, ε)}_{i=1}^k

Algorithm 3 (ShapeDiscover fuzzy cover learning algorithm)
  Input: point cloud X ⊆ R^N
  Parameters: n_cov ∈ N, n_neigh ∈ N, reg > 0
  Optimization parameters: lr, n_epoch, p ∈ [1, ∞)
  G := NeighborhoodGraph(X, n_neigh)
  g := FuzzyCoverInitialization(G, n_cov)
  h := ParametricPartitionOfUnity()
  θ₀ := InitializeParametricModel(h, g)
  L(θ) := (M̂ + reg·Ĝ + T̂ + reg·R̂)(π_p ∘ h_θ)
  θ* := GradientDescent(L, n_epoch, lr, init = θ₀)
  Return π ∘ h_{θ*}
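As a rough illustration of Algorithm 2 above, the following NumPy sketch builds a greedy ε-net and the resulting cover by ε-balls. This is not the authors' implementation; it assumes Euclidean distance and a simple first-uncovered-point selection rule, and represents each cover element as an index set.

```python
import numpy as np

def ball_mapper_cover(X, eps):
    """Greedy sketch of Ball Mapper: build an eps-net of X, then
    return the cover whose elements are the eps-balls around net points
    (each element given as an array of point indices)."""
    n = len(X)
    uncovered = np.ones(n, dtype=bool)
    centers = []
    while uncovered.any():
        # take the first still-uncovered point as the next net element
        i = int(np.flatnonzero(uncovered)[0])
        centers.append(i)
        dists = np.linalg.norm(X - X[i], axis=1)
        uncovered &= dists > eps  # mark everything within eps as covered
    # cover element U_j = indices of points within eps of center y_j
    cover = [np.flatnonzero(np.linalg.norm(X - X[c], axis=1) <= eps)
             for c in centers]
    return centers, cover

# toy usage: 100 points on the unit circle
theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
X = np.c_[np.cos(theta), np.sin(theta)]
centers, cover = ball_mapper_cover(X, eps=0.5)
```

By construction every point is within ε of some net point (so the balls cover X), and any two net points are more than ε apart, which is exactly the ε-net property the algorithm requires.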
Open Source Code | Yes | Our implementation of ShapeDiscover (Scoccola & Lim, 2025) is in PyTorch (Paszke et al., 2019), and we rely on NumPy (Harris et al., 2020), SciPy (Virtanen et al., 2020), Numba (Lam et al., 2015), scikit-learn (Pedregosa et al., 2011), and GUDHI (The GUDHI Project, 2015). (...) Scoccola, L. and Lim, U. ShapeDiscover: Learning covers with geometric optimization. https://github.com/luisscoccola/shapediscover, 2025.
Open Datasets | Yes | The dataset is from (Lederman & Talmon, 2018)... The dataset is from (Gardner et al., 2022)... This is the dataset of (Alpaydin & Kaynak, 1998)... This is the classical dataset of (Deng, 2012)... This is the dataset of (Packer et al., 2019)...
Dataset Splits | No | We use the training data, which consists of 60,000 handwritten digits encoded as vectors in 784 dimensions.
Hardware Specification | Yes | All experiments were run on a MacBook Pro with an Apple M1 Pro processor and 8 GB of RAM.
Software Dependencies | No | Our implementation of ShapeDiscover (Scoccola & Lim, 2025) is in PyTorch (Paszke et al., 2019), and we rely on NumPy (Harris et al., 2020), SciPy (Virtanen et al., 2020), Numba (Lam et al., 2015), scikit-learn (Pedregosa et al., 2011), and GUDHI (The GUDHI Project, 2015).
Experiment Setup | Yes | In all experiments we fix the following default parameters: number of neighbors for the neighborhood graph n_neigh = 15, regularization parameter reg = 10, number of iterations for gradient descent n_epoch = 500, learning rate for gradient descent lr = 0.1, and approximation parameter for fuzzy cover p = 5 (Section 4.4).
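As a hedged illustration only, the sketch below exercises the reported optimization defaults (n_epoch = 500 gradient steps at learning rate lr = 0.1) with plain gradient descent on a stand-in quadratic loss. The actual ShapeDiscover objective, with its four terms M̂, Ĝ, T̂, R̂ and the fuzzy-cover parameter p, is not reproduced here.

```python
import numpy as np

# Default parameters reported in the paper's experiment setup.
n_neigh, reg, n_epoch, lr, p = 15, 10, 500, 0.1, 5

# Stand-in quadratic loss with minimizer at theta = 1; the real
# objective combines four cover-quality and regularization terms.
def loss(theta):
    return float(np.sum((theta - 1.0) ** 2))

def grad(theta):
    return 2.0 * (theta - 1.0)

theta = np.zeros(4)
for _ in range(n_epoch):            # n_epoch = 500 steps
    theta = theta - lr * grad(theta)  # fixed learning rate lr = 0.1
```

With these settings the iterate contracts toward the minimizer by a factor of 0.8 per step, so 500 steps converge far past numerical precision; the point is only to make the roles of n_epoch and lr concrete.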