Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Fermat Distances: Metric Approximation, Spectral Convergence, and Clustering Algorithms
Authors: Nicolás García Trillos, Anna Little, Daniel McKenzie, James M. Murphy
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theoretical analysis is supported with numerical simulations and experiments on synthetic and real image data. |
| Researcher Affiliation | Academia | Nicolás García Trillos EMAIL Department of Statistics, University of Wisconsin, Madison, WI 53706, USA; Anna Little EMAIL Department of Mathematics, Utah Center for Data Science, University of Utah, Salt Lake City, UT 84112, USA; Daniel McKenzie EMAIL Department of Applied Mathematics and Statistics, Colorado School of Mines, Golden, CO 80401, USA; James M. Murphy EMAIL Department of Mathematics, Tufts University, Medford, MA 02155, USA |
| Pseudocode | Yes | Algorithm 1 Fermat Distance Spectral Clustering. 1: Input: data points $x_1, \dots, x_n$, density parameter $p \ge 1$, normalization parameter $s \ge 0$, embedding dimension $r$, kernel scale $h$. 2: Output: Laplacian $L_{p,s}$, FD-SC spectral embedding $[v_1, \dots, v_r] \in \mathbb{R}^{n \times r}$. 4: $W_p(x_i, x_k) \leftarrow \eta(\ell_p(x_i, x_k)/h)$ (compute weights). 5: $d_p(x_i) \leftarrow \sum_k W_p(x_i, x_k)$ (compute normalization factor). 6: $W_{p,s}(x_i, x_k) \leftarrow W_p(x_i, x_k) / \big(d_p(x_i)^{(1 - s/2)} d_p(x_k)^{(1 - s/2)}\big)$ (compute normalized weight matrix). 8: $D_{p,s}(x_i, x_i) \leftarrow \sum_k W_{p,s}(x_i, x_k)$ (compute degree matrix). 9: $L_{p,s} \leftarrow D_{p,s}^{-1}(D_{p,s} - W_{p,s})$ (compute random walk Laplacian). 10: $[v_1, \dots, v_r] \leftarrow$ bottom $r$ eigenvectors of $L_{p,s}$. |
| Open Source Code | Yes | Code to reproduce numerical results is available at https://github.com/JamesMMurphy11/FermatDistances. |
| Open Datasets | No | The paper mentions "synthetic and real image data" and provides images from Unsplash, but does not provide concrete access information (e.g., specific dataset names with citations, links to feature datasets) for publicly available research datasets used in experiments. The other datasets mentioned are synthetic and generated by the authors. |
| Dataset Splits | No | The paper mentions datasets but does not provide specific details on how they were split into training, validation, or test sets for reproducibility. |
| Hardware Specification | No | The paper mentions runtimes for the algorithms in the captions of Figures 5-12, e.g., "Runtime for Fermat Laplacian: 168.46 ± 10.70s.", but it does not specify any hardware details like CPU or GPU models used for these experiments. |
| Software Dependencies | No | The paper describes algorithms but does not provide specific software names with version numbers, nor any programming languages with library versions used for implementation. |
| Experiment Setup | Yes | Algorithm 1 and Algorithm 2 list input parameters such as "density parameter p", "normalization parameter s", "embedding dimension r", "kernel scale h", "parameters q, j". Section 6.1 discusses varying 'p' and 's', and section 6.2 discusses 'p' and 'τ' values (e.g., "p = 1.2, τ = .25"). |
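
The steps of Algorithm 1 quoted in the Pseudocode row can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (their code is at the repository linked above). The function names `fermat_distances` and `fd_spectral_embedding` are hypothetical. The sketch assumes that the Fermat distance $\ell_p$ is the shortest-path cost on the complete graph with edge weights $\|x_i - x_k\|^p$, raised to the power $1/p$, that the kernel $\eta$ is Gaussian, and that the normalization exponent is $1 - s/2$.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path


def fermat_distances(X, p):
    """Discrete Fermat distances on the complete graph (assumed definition)."""
    # Pairwise Euclidean distances between all points.
    euclid = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Edge cost is distance^p; take shortest paths, then the 1/p root.
    path_cost = shortest_path(euclid ** p, directed=False)
    return path_cost ** (1.0 / p)


def fd_spectral_embedding(X, p=2.0, s=2.0, r=2, h=1.0):
    """Sketch of Algorithm 1: returns (L_{p,s}, embedding of shape (n, r))."""
    n = X.shape[0]
    ell = fermat_distances(X, p)

    # Step 4: weights. The Gaussian profile for eta is an assumption.
    W = np.exp(-((ell / h) ** 2))

    # Step 5: normalization factor d_p(x_i) = sum_k W_p(x_i, x_k).
    d = W.sum(axis=1)

    # Step 6: normalized weights (the exponent 1 - s/2 is an assumption).
    scale = d ** (1.0 - s / 2.0)
    W_s = W / np.outer(scale, scale)

    # Step 8: degrees of the normalized weight matrix.
    deg = W_s.sum(axis=1)

    # Step 9: random walk Laplacian D^{-1}(D - W) = I - D^{-1} W.
    L = np.eye(n) - W_s / deg[:, None]

    # Step 10: bottom r eigenvectors. L is similar to the symmetric matrix
    # I - D^{-1/2} W D^{-1/2}, so solve that with eigh and map back
    # with v = D^{-1/2} u. eigh returns eigenvalues in ascending order.
    L_sym = np.eye(n) - W_s / np.sqrt(np.outer(deg, deg))
    _, U = np.linalg.eigh(L_sym)
    V = U[:, :r] / np.sqrt(deg)[:, None]
    return L, V
```

For example, `L, V = fd_spectral_embedding(X, p=2.0, s=2.0, r=3, h=1.0)` on an `(n, d)` array `X` returns the `n × n` Laplacian and an `n × 3` embedding whose rows could then be passed to a clustering routine such as k-means. The dense shortest-path computation is only practical for small `n`.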