Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Inherent Limits on Topology-Based Link Prediction
Authors: Justus Isaiah Hibshman, Tim Weninger
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our main experiment is to calculate how maximum link prediction scores vary with the amount of information given to an idealized algorithm. We run this test on a wide variety of real-world graphs. The procedure runs as follows: [...] We tested the link prediction limits on a wide variety of real-world graphs. They are listed in Table 1. [...] We show the results for the four sparsest graphs [...] The results are in Figure 2. |
| Researcher Affiliation | Academia | Justus Isaiah Hibshman EMAIL Department of Computer Science and Engineering University of Notre Dame. Tim Weninger EMAIL Department of Computer Science and Engineering University of Notre Dame. |
| Pseudocode | No | The paper describes the methodology and formulae in prose and mathematical expressions (e.g., Section 5: Optimal Prediction Performance with ROC and AUPR formulae), but it does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format. |
| Open Source Code | Yes | We provide code both for proper AUPR calculation and for optimal ROC/AUPR scores at https://github.com/SteveWillowby/Link_Prediction_Limits. |
| Open Datasets | Yes | Table 1: Graphs used for tests [...] Sources: [1]: Rossi & Ahmed (2015), [2]: Kunegis (2013), [3]: Clauset et al. (2016), [4]: Leskovec & Krevl (2014), [5]: Yu et al. (2008), [6]: Myers (2003) |
| Dataset Splits | Yes | 1. Begin with a graph G = (V, E) and an edge removal probability p (we set p = 0.1). 2. Define the set of negatives N as all edges not in G. 3. Remove each edge in G with probability p (independently) and add the removed edges to the set of positives P. Call the resulting graph H = (V, E \ P). [...] Ten percent of the graph's edges were randomly selected as test edges. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., CPU, GPU models, memory). There are no mentions of 'NVIDIA', 'Intel', 'AMD', 'TPU', or specific computer configurations. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers. While it refers to libraries like Nauty and Traces (McKay & Piperno (2014)) or frameworks like Graph Neural Networks (Kipf & Welling (2016)), it does not list the versions of these or other key software components used in their experiments. |
| Experiment Setup | Yes | 1. Begin with a graph G = (V, E) and an edge removal probability p (we set p = 0.1). [...] We perform the above procedure multiple times for each graph. [...] If the performance limit just obtained from step 8 is equal to (or within 0.005 of) the performance limit obtained from step 5, then stop. Otherwise, assign k ← k + 1 and go to step 7. |
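The split procedure quoted above (remove each edge independently with probability p; removed edges become positives, non-edges become negatives) can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' code; the function name `split_edges` and the set-based representation are assumptions for clarity.

```python
import random

def split_edges(nodes, edges, p=0.1, seed=0):
    """Sketch of the quoted split: start from G = (V, E), remove each
    edge independently with probability p to form positives P, keep
    H = (V, E \\ P), and take negatives N as all node pairs not in G.
    Edges are assumed undirected and stored as sorted (u, v) tuples."""
    rng = random.Random(seed)
    edges = set(edges)
    # Positives P: edges removed from G, each with probability p.
    positives = {e for e in edges if rng.random() < p}
    # Observed graph H = (V, E \ P).
    remaining = edges - positives
    # Negatives N: all non-edges of G (no self-loops, u < v ordering).
    all_pairs = {(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]}
    negatives = all_pairs - edges
    return remaining, positives, negatives
```

A link predictor would then be scored on how well it ranks P above N given only H, which is the setting whose optimal ROC/AUPR the paper bounds.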