Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
DyG2Vec: Efficient Representation Learning for Dynamic Graphs
Authors: Mohammad Alomrani, Mahdi Biparva, Yingxue Zhang, Mark Coates
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on 7 benchmark datasets indicate that on average, our model outperforms SoTA baselines on the future link prediction task by 4.23% for the transductive setting and 3.30% for the inductive setting while only requiring 5-10x less training/inference time. Lastly, different aspects of the proposed framework are investigated through experimental analysis and ablation studies. |
| Researcher Affiliation | Collaboration | Mohammad Ali Alomrani EMAIL Huawei Noah's Ark Lab Mahdi Biparva EMAIL Huawei Noah's Ark Lab Yingxue Zhang EMAIL Huawei Noah's Ark Lab Mark Coates EMAIL McGill University |
| Pseudocode | No | The paper describes the methodology in text and uses figures (Figure 1 and Figure 2) to illustrate architectures and frameworks, but it does not include a dedicated pseudocode or algorithm block. |
| Open Source Code | Yes | The code is publicly available at https://github.com/huawei-noah/noah-research/tree/master/graph_atlas. |
| Open Datasets | Yes | The code and datasets are publicly available at https://github.com/huawei-noah/noah-research/tree/master/graph_atlas. We use 7 real-world datasets: Wikipedia, Reddit, MOOC, and LastFM (Kumar et al., 2019); Social Evolution, Enron, and UCI (Wang et al., 2021b). |
| Dataset Splits | Yes | We perform the same 70%-15%-15% chronological split for all datasets as in Wang et al. (2021b). |
| Hardware Specification | Yes | All experiments were done on a Ubuntu 20.4 server with 72 Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz cores and a RAM of size 755 Gb. We use a NVIDIA Tesla V100-PCIE-32GB GPU. |
| Software Dependencies | No | The paper mentions using the PyTorch framework (Paszke et al., 2019) and PyTorch Geometric (Fey & Lenssen, 2019) but does not provide specific version numbers for these software dependencies. It also mentions Ubuntu 20.4 as the operating system. |
| Experiment Setup | Yes | We use a constant learning rate of 0.0001 for all datasets and tasks. DyG2Vec is trained for 100 epochs for both downstream and SSL pre-training. For downstream training, we use a constant window size of 64K for all datasets except for MOOC, Social Evolution, and Enron, where we found a smaller window size of 8K works best. The batch size is set to 200 target edges. During SSL pre-training, we use a constant window size of 32K with stride 200. Following previous work (Rossi et al., 2020; Xu et al., 2020), all dynamic node classification training experiments are performed with L2-decay parameter λ = 0.00001 to alleviate over-fitting. Both distortions are applied with dropout probability pd = 0.3. |
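The 70%-15%-15% chronological split reported in the Dataset Splits row can be sketched as below. This is a minimal illustration, not the paper's implementation: the `split_chronological` helper and the `(src, dst, timestamp)` event tuples are assumptions for the sake of the example; the key point is that events are ordered by time before splitting, so no future edge leaks into training.

```python
def split_chronological(events, train_frac=0.70, val_frac=0.15):
    """Split a list of (src, dst, timestamp) edge events by time order.

    Hypothetical sketch of a 70%-15%-15% chronological split in the style
    of Wang et al. (2021b); names and signature are illustrative.
    """
    events = sorted(events, key=lambda e: e[2])  # order by timestamp
    n = len(events)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return events[:train_end], events[train_end:val_end], events[val_end:]

# Example: 20 synthetic edge events with strictly increasing timestamps
events = [(i % 5, (i + 1) % 5, float(i)) for i in range(20)]
train, val, test = split_chronological(events)
print(len(train), len(val), len(test))  # 14 3 3
```

Because the split is by position in the time-sorted sequence rather than by random sampling, every training event precedes every validation event, which in turn precedes every test event.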