Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

The Flood Complex: Large-Scale Persistent Homology on Millions of Points

Authors: Florian Graf, Paolo Pellizzoni, Martin Uray, Stefan Huber, Roland Kwitt

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate the applicability and relevance of the Flood complex as follows: in Section 5.1, we use it to compute PH on point clouds for which existing approaches require an impractical amount of computational resources; in Section 5.2, we study the scalability of the Flood complex, and, in Section 5.3, we show that PH computed on the Flood complex (in short, Flood PH) improves predictions in downstream machine learning tasks compared to simpler approaches such as subsampling.
Researcher Affiliation	Academia	(1) University of Salzburg, Austria; (2) Max Planck Institute of Biochemistry, Germany; (3) Josef Ressel Centre for Intelligent and Secure Industrial Automation, University of Applied Sciences, Salzburg, Austria
Pseudocode	No	The paper describes implementation details in prose in Section A.1 "Implementation details of masking and flooding procedures" but does not provide a formal pseudocode block or algorithm steps.
Open Source Code	Yes	Source code and datasets are available at https://github.com/plus-rkwitt/flooder. We provide the full source code for constructing the Flood complex with subsequent PH computation at https://github.com/plus-rkwitt/flooder.
Open Datasets	Yes	Source code and datasets are available at https://github.com/plus-rkwitt/flooder. In addition to the Flood complex implementation, the flooder package provides all point cloud datasets used for the object classification experiments in Section 5.3. These datasets are ready to use, with all pre-processing steps already applied, and come with pre-defined splits to ensure reproducibility.
Dataset Splits	Yes	We use ten random 80/20% training/testing splits, with 10% of the training data reserved for validation.
Hardware Specification	Yes	The main experiments were run on an SUSE Linux Enterprise Server 15 SP6 system with AMD EPYC 9554 64-Core Processors, 1024 GB of main memory, and NVIDIA H100 80GB HBM3 GPUs.
Software Dependencies	Yes	All experiments were run with flooder (v1.0rc5).
Experiment Setup	Yes	Unless otherwise stated, we select |L| = 2k landmarks from |X| = 1M points using FPS. To compute filtration values, we discretize each simplex based on an equally spaced grid of barycentric coordinates with 20 points per edge (resulting in 210 points per triangle and 1540 points per tetrahedron); cf. Section 4.1. We vectorize persistence diagrams (H0, H1 and H2) using [27] with (exponential) structure elements, parametrized as follows: locations are set to 64 k-means++ centers, computed from the (birth, death) tuples of all diagrams in the training data, scales are chosen as in ATOL [43, Eq. (2)], and the vectorization's stretch parameter is set to either the one, five, or ten percent lifetime quantile (based on the validation data); this yields 64-dim. vectorizations per diagram which, upon concatenation, are fed as 192-dim. feature vectors to an LGBM [31] classifier (except for corals, where we use ℓ1-regularized logistic regression). Hyperparameters are tuned on the validation data using FLAML [50] and a time budget of 10 minutes. We train all models on point clouds subsampled to 2k points by minimizing cross-entropy (or MSE for the regression task on rocks) over 200 epochs using Adam [33] with a cosine annealing schedule and batch size 64. We select the initial learning rate, weight decay, and early stopping period based on the validation data. On the real-world datasets, we use random scaling and shifts as data augmentation.
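
The grid sizes quoted in the Experiment Setup row (210 points per triangle, 1540 per tetrahedron at 20 points per edge) can be checked with a short enumeration. The sketch below is illustrative only: the function name `barycentric_grid` is hypothetical and is not taken from the flooder package, and the enumeration is a plain brute-force construction rather than the authors' implementation. It lists every barycentric coordinate tuple whose entries are multiples of 1/19 and sum to 1, then compares the counts against the closed form C(n - 1 + d, d) for a d-simplex with n points per edge.

```python
from itertools import product
from math import comb


def barycentric_grid(dim, points_per_edge):
    """Enumerate an equally spaced barycentric grid on a dim-simplex.

    Hypothetical helper for illustration; not part of the flooder API.
    Each returned tuple has dim + 1 non-negative entries that sum to 1.
    """
    steps = points_per_edge - 1  # number of intervals along each edge
    grid = []
    # Choose the first `dim` integer coordinates freely; the last one
    # is whatever remains so that all coordinates sum to `steps`.
    for idx in product(range(steps + 1), repeat=dim):
        rest = steps - sum(idx)
        if rest >= 0:
            grid.append(tuple(i / steps for i in idx) + (rest / steps,))
    return grid


# Count grid points for an edge (1), a triangle (2), and a tetrahedron (3).
counts = {dim: len(barycentric_grid(dim, 20)) for dim in (1, 2, 3)}

# Closed form: C(steps + dim, dim) with steps = points_per_edge - 1.
closed_form = {dim: comb(19 + dim, dim) for dim in (1, 2, 3)}

print(counts)
print(closed_form)
```

Both dictionaries agree with the figures in the table: 20 points per edge, 210 per triangle, and 1540 per tetrahedron.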