Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Inherent Limits on Topology-Based Link Prediction

Authors: Justus Isaiah Hibshman, Tim Weninger

TMLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our main experiment is to calculate how maximum link prediction scores vary with the amount of information given to an idealized algorithm. We run this test on a wide variety of real-world graphs. The procedure runs as follows: [...] We tested the link prediction limits on a wide variety of real-world graphs. They are listed in Table 1. [...] We show the results for the four sparsest graphs [...] The results are in Figure 2.
Researcher Affiliation | Academia | Justus Isaiah Hibshman EMAIL, Department of Computer Science and Engineering, University of Notre Dame. Tim Weninger EMAIL, Department of Computer Science and Engineering, University of Notre Dame.
Pseudocode | No | The paper describes its methodology and formulae in prose and mathematical expressions (e.g., Section 5: Optimal Prediction Performance, with ROC and AUPR formulae), but it does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format.
Open Source Code | Yes | We provide code both for proper AUPR calculation and for optimal ROC/AUPR scores at https://github.com/SteveWillowby/Link_Prediction_Limits.
Open Datasets | Yes | Table 1: Graphs used for tests [...] Sources: [1]: Rossi & Ahmed (2015), [2]: Kunegis (2013), [3]: Clauset et al. (2016), [4]: Leskovec & Krevl (2014), [5]: Yu et al. (2008), [6]: Myers (2003)
Dataset Splits | Yes | 1. Begin with a graph G = (V, E) and an edge removal probability p (we set p = 0.1). 2. Define the set of negatives N as all edges not in G. 3. Remove each edge in G with probability p (independently) and add the removed edges to the set of positives P. Call the resulting graph H = (V, E \ P). [...] Ten percent of the graph's edges were randomly selected as test edges.
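The quoted split procedure can be sketched in a few lines. This is a minimal illustration with a hypothetical function name and signature, not the paper's implementation (see the linked repository for that):

```python
import itertools
import random

def make_split(nodes, edges, p=0.1, seed=0):
    """Split a graph into an observed graph H, positives P, and negatives N,
    per the quoted procedure: negatives are all non-edges of G; each edge of G
    is removed independently with probability p and becomes a positive test
    edge; the surviving edges form H. (Illustrative sketch, undirected graph.)
    """
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in edges}
    # Negatives: every node pair that is not an edge in G.
    all_pairs = {frozenset(pair) for pair in itertools.combinations(nodes, 2)}
    negatives = all_pairs - edge_set
    # Positives: edges removed independently with probability p.
    positives = {e for e in edge_set if rng.random() < p}
    # Observed graph H = (V, E \ P).
    observed = edge_set - positives
    return observed, positives, negatives
```

By construction the observed edges and positives partition E, and the negatives are disjoint from E, which is what the idealized link predictor is scored against.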
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., CPU, GPU models, memory). There are no mentions of 'NVIDIA', 'Intel', 'AMD', 'TPU', or specific computer configurations.
Software Dependencies | No | The paper does not provide specific software dependency details with version numbers. While it refers to libraries like Nauty and Traces (McKay & Piperno (2014)) or frameworks like Graph Neural Networks (Kipf & Welling (2016)), it does not list the versions of these or other key software components used in the experiments.
Experiment Setup | Yes | 1. Begin with a graph G = (V, E) and an edge removal probability p (we set p = 0.1). [...] We perform the above procedure multiple times for each graph. [...] If the performance limit just obtained from step 8 is equal to (or within 0.005 of) the performance limit obtained from step 5, then stop. Otherwise, assign k ← k + 1 and go to step 7.
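The quoted stopping rule (grow k until the performance limit changes by at most 0.005) can be sketched generically. Here `limit_at` is a hypothetical callable standing in for the paper's steps 5-8, not an API from the paper:

```python
def iterate_until_stable(limit_at, k0=1, tol=0.005, max_iter=1000):
    """Increase k until the performance limit obtained at the current k is
    within tol of the limit obtained at the previous k, per the quoted rule.
    `limit_at(k)` is a hypothetical stand-in for the per-k evaluation."""
    prev = limit_at(k0)
    k = k0 + 1
    for _ in range(max_iter):
        cur = limit_at(k)
        if abs(cur - prev) <= tol:  # converged: limits agree to within tol
            return k, cur
        prev, k = cur, k + 1
    return k, prev  # give up after max_iter increments
```

With any saturating curve for `limit_at` (e.g. one approaching an asymptote as k grows), the loop halts at the first k whose limit is within 0.005 of its predecessor's.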