Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

ContextGNN: Beyond Two-Tower Recommendation Systems

Authors: Yiwen Yuan, Zecheng Zhang, Xinwei He, Akihiro Nitta, Weihua Hu, Manan Shah, Blaz Stojanovic, Shenyang(Andy) Huang, Jan E Lenssen, Jure Leskovec, Matthias Fey

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate that CONTEXTGNN is able to adapt to different data characteristics and outperforms existing methods, both traditional and GNN-based, on a diverse set of practical recommendation tasks, improving performance by 20% on average.
Researcher Affiliation	Industry	Yiwen Yuan, Zecheng Zhang, Xinwei He, Akihiro Nitta, Weihua Hu, Manan Shah, Blaž Stojanovič, Shenyang Huang, Jan Eric Lenssen, Jure Leskovec, Matthias Fey Kumo.AI
Pseudocode	No	The paper describes the methods through textual explanations, mathematical formulas (e.g., Equation 1 and 2), and a high-level architectural diagram (Figure 1), but it does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	Our method is implemented in PYTORCH (Paszke et al., 2019) utilizing the PYTORCH GEOMETRIC (Fey & Lenssen, 2019) and PYTORCH FRAME (Hu et al., 2024) libraries. GitHub: https://github.com/kumo-ai/ContextGNN
Open Datasets	Yes	We utilize the recommendation tasks introduced in RELBENCH (Robinson et al., 2024), which consists of eight different realistic and temporal-aware recommendation tasks. We evaluate CONTEXTGNN on the static link prediction task of Amazon-Book (Wang et al., 2019). We use CONTEXTGNN to perform temporal next-item recommendation on the IJCAI Contest dataset (Xia et al., 2022).
Dataset Splits	Yes	Table 1: The locality score for different subgraph depths k ∈ {1, 3} on validation/test splits for all recommendation tasks in RELBENCH. We evaluate CONTEXTGNN on the static link prediction task of Amazon-Book (Wang et al., 2019), which ... evaluates on 10% of randomly selected interactions independent of time.
Hardware Specification	Yes	In practice, we have no issues to scale the number of classes C to 1M on commodity GPUs (15GB of memory)
Software Dependencies	No	Our method is implemented in PYTORCH (Paszke et al., 2019) utilizing the PYTORCH GEOMETRIC (Fey & Lenssen, 2019) and PYTORCH FRAME (Hu et al., 2024) libraries. While these libraries are mentioned with their respective publication years, specific version numbers (e.g., PyTorch 1.x) are not provided in the text.
Experiment Setup	Yes	The hyperparameters we tune for each task are: (1) the number of hidden units {32, 64, 128, 256, 512}, (2) the batch size {256, 512, 1024}, and (3) the learning rate {0.001, 0.01}.
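
To give a sense of the size of the search space quoted in the Experiment Setup row, the following is a minimal sketch that enumerates every combination of the listed values. It assumes an exhaustive grid over the three hyperparameters, which the quoted text does not confirm; the names SEARCH_SPACE and enumerate_configs are hypothetical and are not taken from the paper or its repository.

```python
from itertools import product

# Values copied from the Experiment Setup row. The dictionary keys are
# illustrative names, not identifiers from the ContextGNN code base.
SEARCH_SPACE = {
    "hidden_units": [32, 64, 128, 256, 512],
    "batch_size": [256, 512, 1024],
    "learning_rate": [0.001, 0.01],
}


def enumerate_configs(space):
    """Return every combination of the listed values as a list of dicts.

    Assumption: an exhaustive grid. The source only says these
    hyperparameters are tuned, not which search strategy is used.
    """
    names = list(space)
    value_lists = [space[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


configs = enumerate_configs(SEARCH_SPACE)
print(len(configs))  # 5 * 3 * 2 = 30 combinations
print(configs[0])
```

Under this assumption, each task has 30 candidate configurations; a random or staged search would evaluate fewer.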