Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Inherent Limits on Topology-Based Link Prediction

Authors: Justus Isaiah Hibshman, Tim Weninger

TMLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our main experiment is to calculate how maximum link prediction scores vary with the amount of information given to an idealized algorithm. We run this test on a wide variety of real-world graphs. The procedure runs as follows: [...] We tested the link prediction limits on a wide variety of real-world graphs. They are listed in Table 1. [...] We show the results for the four sparsest graphs [...] The results are in Figure 2.
Researcher Affiliation | Academia | Justus Isaiah Hibshman EMAIL, Department of Computer Science and Engineering, University of Notre Dame. Tim Weninger EMAIL, Department of Computer Science and Engineering, University of Notre Dame.
Pseudocode | No | The paper describes its methodology and formulae in prose and mathematical expressions (e.g., Section 5: Optimal Prediction Performance, with ROC and AUPR formulae), but it does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format.
Open Source Code | Yes | We provide code both for proper AUPR calculation and for optimal ROC/AUPR scores at https://github.com/SteveWillowby/Link_Prediction_Limits.
Open Datasets | Yes | Table 1: Graphs used for tests [...] Sources: [1]: Rossi & Ahmed (2015), [2]: Kunegis (2013), [3]: Clauset et al. (2016), [4]: Leskovec & Krevl (2014), [5]: Yu et al. (2008), [6]: Myers (2003)
Dataset Splits | Yes | 1. Begin with a graph G = (V, E) and an edge removal probability p (we set p = 0.1). 2. Define the set of negatives N as all edges not in G. 3. Remove each edge in G with probability p (independently) and add the removed edges to the set of positives P. Call the resulting graph H = (V, E \ P). [...] Ten percent of the graph's edges were randomly selected as test edges.
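The quoted split procedure can be sketched in a few lines. This is a minimal illustration with a hypothetical function name and signature, not the paper's implementation (see the linked repository for that):

```python
import itertools
import random

def make_split(nodes, edges, p=0.1, seed=0):
    """Split a graph into an observed graph H, positives P, and negatives N,
    per the quoted procedure: negatives are all non-edges of G; each edge of G
    is removed independently with probability p and becomes a positive test
    edge; the surviving edges form H. (Illustrative sketch, undirected graph.)
    """
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in edges}
    # Negatives: every node pair that is not an edge in G.
    all_pairs = {frozenset(pair) for pair in itertools.combinations(nodes, 2)}
    negatives = all_pairs - edge_set
    # Positives: edges removed independently with probability p.
    positives = {e for e in edge_set if rng.random() < p}
    # Observed graph H = (V, E \ P).
    observed = edge_set - positives
    return observed, positives, negatives
```

By construction the observed edges and positives partition E, and the negatives are disjoint from E, which is what the idealized link predictor is scored against.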
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., CPU, GPU models, memory). There are no mentions of 'NVIDIA', 'Intel', 'AMD', 'TPU', or specific computer configurations.
Software Dependencies | No | The paper does not provide specific software dependency details with version numbers. While it refers to libraries like Nauty and Traces (McKay & Piperno (2014)) or frameworks like Graph Neural Networks (Kipf & Welling (2016)), it does not list the versions of these or other key software components used in the experiments.
Experiment Setup | Yes | 1. Begin with a graph G = (V, E) and an edge removal probability p (we set p = 0.1). [...] We perform the above procedure multiple times for each graph. [...] If the performance limit just obtained from step 8 is equal to (or within 0.005 of) the performance limit obtained from step 5, then stop. Otherwise, assign k ← k + 1 and go to step 7.
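The quoted stopping rule (grow k until the performance limit changes by at most 0.005) can be sketched generically. Here `limit_at` is a hypothetical callable standing in for the paper's steps 5-8, not an API from the paper:

```python
def iterate_until_stable(limit_at, k0=1, tol=0.005, max_iter=1000):
    """Increase k until the performance limit obtained at the current k is
    within tol of the limit obtained at the previous k, per the quoted rule.
    `limit_at(k)` is a hypothetical stand-in for the per-k evaluation."""
    prev = limit_at(k0)
    k = k0 + 1
    for _ in range(max_iter):
        cur = limit_at(k)
        if abs(cur - prev) <= tol:  # converged: limits agree to within tol
            return k, cur
        prev, k = cur, k + 1
    return k, prev  # give up after max_iter increments
```

With any saturating curve for `limit_at` (e.g. one approaching an asymptote as k grows), the loop halts at the first k whose limit is within 0.005 of its predecessor's.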