Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Hamilton-Jacobi equations on graphs with applications to semi-supervised learning and data depth
Authors: Jeff Calder, Mahmood Ettehad
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We consider applications of the p-eikonal equation to data depth and semi-supervised learning, and use the continuum limit to prove asymptotic consistency results for both applications. Finally, we show the results of experiments with data depth and semi-supervised learning on real image datasets, including MNIST, Fashion MNIST and CIFAR-10, which show that the p-eikonal equation offers significantly better results compared to shortest path distances. |
| Researcher Affiliation | Academia | Jeff Calder EMAIL School of Mathematics University of Minnesota Minneapolis, MN 55455, USA Mahmood Ettehad EMAIL Institute for Mathematics and its Applications (IMA) University of Minnesota Minneapolis, MN 55455, USA |
| Pseudocode | No | The paper describes algorithms and methods in textual form, such as the solution of the p-eikonal equation via fast marching, but does not present them as structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source Code: https://github.com/jwcalder/peikonal |
| Open Datasets | Yes | Finally, we show the results of experiments with data depth and semi-supervised learning on real image datasets, including MNIST, Fashion MNIST and CIFAR-10, which show that the p-eikonal equation offers significantly better results compared to shortest path distances. We consider the MNIST dataset of handwritten digits (Le Cun et al., 1998) and the Fashion MNIST dataset (Xiao et al., 2017), which is a drop-in replacement for MNIST consisting of 10 classes of clothing items. In addition to MNIST and Fashion MNIST, we also tested on CIFAR-10 (Krizhevsky et al., 2009). |
| Dataset Splits | No | Each dataset has 70,000 grayscale images of size 28x28 pixels. For both datasets we restricted the computations of data depth to each individual class, which consists of about 7000 datapoints per class. To speed up the computation of (26), we computed the minimum in (26) over 5% of the nodes in each class, chosen at random. We ran 100 trials at 1 label per class up to 5 labels per class, randomly choosing different labeled data for each trial. The paper describes the overall dataset sizes and how labeled data was selected for trials, but does not specify standard training/test/validation splits typically used for reproducibility. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments. It mentions using the 'Graph Learning Python package' and 'C programming language' for implementation, but no details about CPUs, GPUs, or other computational resources are provided. |
| Software Dependencies | No | All code for the experiments is available online and uses the Graph Learning Python package (Calder, 2022). The paper mentions the 'Graph Learning Python package' by name, but does not specify a version number for it or any other key software dependencies. |
| Experiment Setup | Yes | For both datasets we restricted the computations of data depth to each individual class, which consists of about 7000 datapoints per class. We constructed the graph by connecting each image to its K-nearest neighbors with Gaussian weights given by (105) where xi represents the pixel values for image i, and dK(xi) is the distance between xi and its Kth nearest neighbor. We used K = 20 in all experiments. The weight matrix was then symmetrized by replacing W with Wᵀ. We computed the p-eikonal median via the definition (26) with p = 1 and α = 2. For the density estimator ρ̂ we used a k-nearest neighbor density estimator with k = 30. To speed up the computation of (26), we computed the minimum in (26) over 5% of the nodes in each class, chosen at random. For MNIST and Fashion MNIST, we used variational autoencoders, similar to (Kingma and Welling, 2014), while for CIFAR-10 we used the Auto Encoding Transformations architecture from (Zhang et al., 2019). After training the autoencoders we built K-nearest neighbor graphs with weights given by (105) over the latent variables using the angular similarity with K = 20 neighbors. We again used a k-nearest neighbor density estimator with k = 30 neighbors for the reweighting. We ran 100 trials at 1 label per class up to 5 labels per class, randomly choosing different labeled data for each trial. |
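The Experiment Setup row describes a K-nearest-neighbor graph with self-tuning Gaussian weights (each point's bandwidth set by its distance to its Kth neighbor) plus a k-NN density estimator. The sketch below is an illustrative reconstruction under standard conventions, not the authors' code: the exact weight formula (equation (105) in the paper) and symmetrization are assumed, and `knn_gaussian_graph` / `knn_density` are hypothetical helper names.

```python
import numpy as np
from scipy import sparse
from sklearn.neighbors import NearestNeighbors

def knn_gaussian_graph(X, K=20):
    """K-NN graph with self-tuning Gaussian weights (assumed form of (105)):
    w_ij = exp(-|x_i - x_j|^2 / d_K(x_i)^2), then symmetrized."""
    nbrs = NearestNeighbors(n_neighbors=K + 1).fit(X)
    dist, idx = nbrs.kneighbors(X)      # column 0 is the point itself
    dK = dist[:, -1]                    # d_K(x_i): distance to Kth neighbor
    n = X.shape[0]
    rows = np.repeat(np.arange(n), K)
    cols = idx[:, 1:].ravel()
    vals = np.exp(-dist[:, 1:] ** 2 / dK[:, None] ** 2).ravel()
    W = sparse.csr_matrix((vals, (rows, cols)), shape=(n, n))
    return W.maximum(W.T)               # one common symmetrization choice

def knn_density(X, k=30):
    """Standard k-NN density estimate: rho_i proportional to
    k / (n * d_k(x_i)^d), normalized to sum to one."""
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, _ = nbrs.kneighbors(X)
    n, d = X.shape
    rho = k / (n * dist[:, -1] ** d)
    return rho / rho.sum()
```

In practice these steps are handled by the Graph Learning package the paper cites; the sketch only makes the construction in the quoted setup concrete.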
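The setup also describes the evaluation protocol: 100 trials at 1 to 5 labels per class, with labeled points redrawn at random each trial. A minimal sketch of that protocol, assuming uniform sampling without replacement per class; `choose_labeled` and `run_trials` are hypothetical names, and `classify` is a placeholder for the p-eikonal semi-supervised classifier, which this sketch does not implement.

```python
import numpy as np

def choose_labeled(labels, m, rng):
    """Sample m labeled indices per class, uniformly without replacement."""
    picks = [rng.choice(np.where(labels == c)[0], size=m, replace=False)
             for c in np.unique(labels)]
    return np.concatenate(picks)

def run_trials(labels, classify, n_trials=100, seed=0):
    """Repeat the experiment over 1..5 labels per class, n_trials each,
    returning (mean, std) accuracy per label rate."""
    rng = np.random.default_rng(seed)
    results = {}
    for m in range(1, 6):
        scores = [classify(choose_labeled(labels, m, rng))
                  for _ in range(n_trials)]
        results[m] = (np.mean(scores), np.std(scores))
    return results
```

Averaging over many random draws of labeled data, rather than a fixed split, is why the report flags "Dataset Splits" as No: there is no single train/test partition to reproduce, only the sampling procedure.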