Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

Estimating Hitting Times Locally at Scale

Authors: Themistoklis Haris, Fabian Spaeh, Spyridon Konstantinos Dragazis, Charalampos Tsourakakis

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we perform a comparative study of different algorithms for estimating the hitting time HG(u, v) between vertices u and v. Our experiments are conducted in Python 3.6 with Numba 0.53 on a 2.9 GHz Intel Xeon Gold 6226R processor with 384GB of RAM. We first compare the performance and error of each of those algorithms on synthetic graph datasets. We also run our algorithms on real world graphs, specifically the American Football Division IA games graph [Girvan and Newman, 2002] and the SNAP Facebook and Twitter graphs [Leskovec and Mcauley, 2012]. We find that Algorithm 1 consistently achieves low error with small variance.
Researcher Affiliation	Collaboration	Themistoklis Haris Department of Computer Science Boston University Fabian Spaeh Celonis Spyros Dragazis Department of Computer Science Boston University Charalampos Tsourakakis Department of Computer Science Boston University
Pseudocode	Yes	Our algorithm is given as pseudocode in Algorithm 1. Algorithm 1: Estimating the Hitting Time via Meeting Times. Algorithm 2: Estimating the Effective Resistance via Meeting Times. Algorithm 3: A Cutoff Algorithm for Estimating Hitting Times.
Open Source Code	Yes	1The code is available at https://github.com/tkhar/hitting-time-at-scale.
Open Datasets	Yes	We first compare the performance and error of each of those algorithms on synthetic graph datasets. We focus on three random graph models: Erd os-Rényi, where each pair of vertices forms an edge with probability p, Barabási-Albert, where a preferential attachment mechanism is used, and the Stochastic Block Model (SBM), where we model the emergence of community structures. We also run our algorithms on real world graphs, specifically the American Football Division IA games graph [Girvan and Newman, 2002] and the SNAP Facebook and Twitter graphs [Leskovec and Mcauley, 2012].
Dataset Splits	No	We estimate the hitting time from the first to the last node. To stress test our algorithms adequately, we randomly sample pairs u, v uniformly, but also proportionally and inversely proportional according to the product of degrees or Pagerank centralities Page et al. [1999].
Hardware Specification	Yes	Our experiments are conducted in Python 3.6 with Numba 0.53 on a 2.9 GHz Intel Xeon Gold 6226R processor with 384GB of RAM.
Software Dependencies	Yes	Our experiments are conducted in Python 3.6 with Numba 0.53 on a 2.9 GHz Intel Xeon Gold 6226R processor with 384GB of RAM.
Experiment Setup	Yes	Our algorithms are highly parallelizable because their local nature lends itself to multi-threaded computation. Through parallelization we obtain even better performance in computing the hitting time compared to the exact solver, as shown in Table 3. Here, we compute hitting times on the Facebook network using 10000 random walks. We present additional ablation studies evaluating runtime and relative estimation error as functions of the number of random walks, using the Football and Facebook networks. We also perform the same experiment for our synthetic datasets. In the Football network, our cutoff algorithm under-performs the others, likely due to an overestimated λ parameter highlighting its sensitivity to hyperparameter tuning. In contrast, Algorithm 1 appears more robust. We also visualize the correlation between hitting times and two pair sampling measures, illustrating how our method differs from simpler uniform sampling.