Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Hitting and Commute Times in Large Random Neighborhood Graphs

Authors: Ulrike von Luxburg, Agnes Radl, Matthias Hein

JMLR 2014 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section we examine the convergence behavior of the commute distance in practice. We ran a large number of simulations, both on artificial and real-world data sets, in order to evaluate whether the rescaled commute distance in a graph is close to its predicted limit expression 1/d_u + 1/d_v. We conducted simulations for artificial graphs (random geometric graphs, planted partition graphs, preferential attachment graphs) and real-world data sets of various types and sizes (social networks, biological networks, traffic networks; up to 1.6 million vertices and 22 million edges). The general setup is as follows. Given a graph, we compute the pairwise resistance distance R_ij between all points. The relative deviation between the resistance distance and the predicted result is then given by RelDev(i, j) := |R_ij − 1/d_i − 1/d_j| / (1/d_i + 1/d_j). Note that we report relative rather than absolute deviations because this is more meaningful if R_ij fluctuates strongly on the graph. It also allows us to compare the behavior of different graphs with each other. In all figures, we then report the maximum, mean and median relative deviations.
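The deviation measure quoted above can be sketched in a few lines of NumPy, using the standard identity that the resistance distance is recoverable from the Moore-Penrose pseudoinverse of the graph Laplacian, R_ij = L+_ii + L+_jj − 2·L+_ij. This is a minimal illustration, not the authors' code: the function name `relative_deviation` is ours, and the normalization by the predicted value 1/d_i + 1/d_j follows the definition of a relative deviation as quoted.

```python
import numpy as np

def relative_deviation(A):
    """For every pair (i, j), compare the resistance distance R_ij with the
    predicted limit 1/d_i + 1/d_j and return the matrix of relative deviations.

    A: symmetric 0/1 adjacency matrix of a connected, unweighted graph.
    """
    d = A.sum(axis=1)                    # vertex degrees
    L = np.diag(d) - A                   # graph Laplacian
    Lp = np.linalg.pinv(L)               # Moore-Penrose pseudoinverse of L
    diag = np.diag(Lp)
    # Resistance distance: R_ij = L+_ii + L+_jj - 2 L+_ij
    R = diag[:, None] + diag[None, :] - 2 * Lp
    pred = 1.0 / d[:, None] + 1.0 / d[None, :]   # predicted limit 1/d_i + 1/d_j
    rel = np.abs(R - pred) / pred
    np.fill_diagonal(rel, 0.0)           # only off-diagonal pairs are meaningful
    return rel

# Example: the complete graph K4, where R_ij = 1/2 and 1/d_i + 1/d_j = 2/3,
# so every off-diagonal relative deviation equals 0.25.
A = np.ones((4, 4)) - np.eye(4)
rel = relative_deviation(A)
```

From the matrix `rel`, the maximum, mean, and median over off-diagonal entries give exactly the three summary statistics the excerpt reports.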
Researcher Affiliation Academia Ulrike von Luxburg (EMAIL), Department of Computer Science, University of Hamburg, Vogt-Koelln-Str. 30, 22527 Hamburg, Germany; Agnes Radl (EMAIL), Institute of Mathematics, University of Leipzig, Augustusplatz 10, 04109 Leipzig, Germany; Matthias Hein (EMAIL), Department of Mathematics and Computer Science, Saarland University, Campus E1 1, 66123 Saarbrücken, Germany
Pseudocode No The paper describes methods and proofs using mathematical notation and natural language, such as in "Construction of the flow overview" (Section 6.2). However, it does not contain any clearly labeled pseudocode blocks or algorithms formatted like code.
Open Source Code No The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets Yes We start with the class of random geometric graphs, which is very important for machine learning. We use a mixture of two Gaussian distributions on R^d. The first two dimensions contain a two-dimensional mixture of two Gaussians with varying separation (centers (−sep/2, 0) and (+sep/2, 0), covariance matrix 0.2·I, mixing weights 0.5 for both Gaussians). The remaining d − 2 dimensions contain Gaussian noise with variance 0.2 as well. From this distribution we draw n sample points. Based on this sample, we either compute the unweighted symmetric k-NN graph, the unweighted ε-graph or the Gaussian similarity graph. ... Now we consider the deviations for a couple of real-world data sets. We start with similarity graphs on real data, as they are often used in machine learning. As an example we use the full USPS data set of handwritten digits (9298 points in 256 dimensions) and consider the different forms of similarity graphs (k-NN, ε, Gaussian) with varying connectivity parameter. ... Figure 5: Relative error of the approximation in real-world networks. Downloaded from: 1, http://deim.urv.cat/~aarenas/data/welcome.htm and 2, http://www.cise.ufl.edu/research/sparse/matrices/. See text for details.
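The sampling and graph-construction step quoted above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function names, the brute-force distance computation, and the exact parameter conventions (e.g., how ties are broken among nearest neighbors) are our assumptions.

```python
import numpy as np

def sample_mixture(n, d, sep, rng):
    """Draw n points from the mixture the excerpt describes: the first two
    dimensions hold a mixture of two Gaussians with centers (-sep/2, 0) and
    (+sep/2, 0), covariance 0.2*I, and equal weights; the remaining d - 2
    dimensions hold Gaussian noise with variance 0.2."""
    signs = rng.choice([-1.0, 1.0], size=n)          # mixture component per point
    X = rng.normal(scale=np.sqrt(0.2), size=(n, d))  # isotropic noise, variance 0.2
    X[:, 0] += signs * sep / 2.0                     # shift the first coordinate
    return X

def symmetric_knn_graph(X, k):
    """Unweighted symmetric k-NN graph: i ~ j if i is among the k nearest
    neighbors of j, or vice versa."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)                      # exclude self-neighbors
    nn = np.argsort(D, axis=1)[:, :k]                # k nearest neighbors per point
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nn.ravel()] = 1.0  # directed k-NN edges
    return np.maximum(A, A.T)                        # symmetrize ("or" rule)

rng = np.random.default_rng(0)
X = sample_mixture(200, d=5, sep=2.0, rng=rng)
A = symmetric_knn_graph(X, k=10)
```

The matched ε-graph and Gaussian graph of the excerpt would reuse the same sample, with ε and σ set to the maximal k-nearest-neighbor distance.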
Dataset Splits No The paper describes graph generation parameters for artificial datasets and sampling strategies for real-world datasets (e.g., "draw 20 vertices at random"), but it does not specify explicit training/validation/test splits, which are typically used for model evaluation in machine learning tasks.
Hardware Specification No The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the simulations or experiments.
Software Dependencies No The paper does not list any specific software dependencies or their version numbers that were used for the implementation or analysis.
Experiment Setup Yes We use a mixture of two Gaussian distributions on R^d. The first two dimensions contain a two-dimensional mixture of two Gaussians with varying separation (centers (−sep/2, 0) and (+sep/2, 0), covariance matrix 0.2·I, mixing weights 0.5 for both Gaussians). The remaining d − 2 dimensions contain Gaussian noise with variance 0.2 as well. From this distribution we draw n sample points. Based on this sample, we either compute the unweighted symmetric k-NN graph, the unweighted ε-graph or the Gaussian similarity graph. In order to be able to compare the results between these three types of graphs we match the parameters of the different graphs: given some value k for the k-NN graph, we choose the values of ε for the ε-graph and σ for the Gaussian graph as the maximal k-nearest-neighbor distance in the data set. ... Next we consider graphs according to the planted partition model. We modeled a graph with n vertices and two equal-sized clusters with connectivity parameters p_within and p_between. ... In our simulation we generated preferential attachment graphs according to the following standard procedure: starting with a graph that consists of two vertices connected by an edge, in each time step we add a new vertex to the graph. The new vertex is connected by a fixed number of edges to existing vertices (this number is called NumLinks in the figures below). The target vertices of these edges are chosen randomly among all existing vertices, where the probability of connecting to a particular vertex is proportional to its degree.
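The preferential attachment procedure quoted above can be sketched as follows. This is an illustrative implementation, not the authors' code; in particular, we assume each new vertex attaches to distinct targets, which the excerpt does not specify.

```python
import random

def preferential_attachment_graph(n, num_links, seed=None):
    """Grow a graph by the standard procedure the excerpt describes: start with
    two vertices joined by an edge; each new vertex attaches num_links edges to
    existing vertices chosen with probability proportional to their degree."""
    rnd = random.Random(seed)
    edges = [(0, 1)]
    # Degree-weighted urn: each endpoint occurrence contributes one unit of
    # degree, so a uniform draw from the urn is degree-proportional sampling.
    urn = [0, 1]
    for v in range(2, n):
        m = min(num_links, v)             # cannot attach to more vertices than exist
        targets = set()
        while len(targets) < m:
            targets.add(rnd.choice(urn))  # degree-proportional choice, no repeats
        for t in targets:
            edges.append((v, t))
            urn.extend([v, t])            # both endpoints gain one unit of degree
    return edges

edges = preferential_attachment_graph(100, num_links=3, seed=0)
```

For the planted partition model of the excerpt, the analogous sketch would simply flip a p_within or p_between coin for each vertex pair depending on whether the two vertices share a cluster.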