Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

DyG2Vec: Efficient Representation Learning for Dynamic Graphs

Authors: Mohammad Alomrani, Mahdi Biparva, Yingxue Zhang, Mark Coates

TMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on 7 benchmark datasets indicate that on average, our model outperforms SoTA baselines on the future link prediction task by 4.23% for the transductive setting and 3.30% for the inductive setting while only requiring 5-10x less training/inference time. Lastly, different aspects of the proposed framework are investigated through experimental analysis and ablation studies.
Researcher Affiliation | Collaboration | Mohammad Ali Alomrani EMAIL Huawei Noah's Ark Lab; Mahdi Biparva EMAIL Huawei Noah's Ark Lab; Yingxue Zhang EMAIL Huawei Noah's Ark Lab; Mark Coates EMAIL McGill University
Pseudocode | No | The paper describes the methodology in text and uses figures (Figure 1 and Figure 2) to illustrate architectures and frameworks, but it does not include a dedicated pseudocode or algorithm block.
Open Source Code | Yes | The code is publicly available at https://github.com/huawei-noah/noah-research/tree/master/graph_atlas.
Open Datasets | Yes | The code and datasets are publicly available at https://github.com/huawei-noah/noah-research/tree/master/graph_atlas. We use 7 real-world datasets: Wikipedia, Reddit, MOOC, and LastFM (Kumar et al., 2019); Social Evolution, Enron, and UCI (Wang et al., 2021b).
Dataset Splits | Yes | We perform the same 70%-15%-15% chronological split for all datasets as in Wang et al. (2021b).
Hardware Specification | Yes | All experiments were done on a Ubuntu 20.4 server with 72 Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz cores and a RAM of size 755 Gb. We use a NVIDIA Tesla V100-PCIE-32GB GPU.
Software Dependencies | No | The paper mentions using the PyTorch framework (Paszke et al., 2019) and PyTorch Geometric (Fey & Lenssen, 2019) but does not provide specific version numbers for these software dependencies. It also mentions Ubuntu 20.4 as the operating system.
Experiment Setup | Yes | We use a constant learning rate of 0.0001 for all datasets and tasks. DyG2Vec is trained for 100 epochs for both downstream and SSL pre-training. For downstream training, we use a constant window size of 64K for all datasets except for MOOC, Social Evolve, and Enron where we found a smaller window size of 8K works best. The batch size is set to 200 target edges. During SSL pre-training, we use a constant window size of 32K with stride 200. Following previous work (Rossi et al., 2020; Xu et al., 2020), all dynamic node classification training experiments are performed with L2-decay parameter λ = 0.00001 to alleviate over-fitting. Both distortions are applied with dropout probability p_d = 0.3.
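
The 70%-15%-15% chronological split quoted under Dataset Splits could be sketched roughly as follows. This is only an illustration and is not taken from the paper's repository: the function name `chronological_split`, its parameters, and the choice to cut by edge count (rather than by timestamp quantile) are assumptions, and the procedure in Wang et al. (2021b) may differ in such details.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    timestamps: Sequence[float],
    train_frac: float = 0.70,  # fractions quoted in the Dataset Splits entry
    val_frac: float = 0.15,
) -> Tuple[List[int], List[int], List[int]]:
    """Hypothetical helper: return (train, val, test) edge indices by time."""
    # Order edge indices by timestamp; sorted() is stable, so ties keep input order.
    order = sorted(range(len(timestamps)), key=lambda i: timestamps[i])
    n = len(order)
    # round() avoids off-by-one cuts caused by floating-point products.
    train_end = round(n * train_frac)
    val_end = round(n * (train_frac + val_frac))
    # Earliest edges train the model; the latest edges are held out for testing.
    return order[:train_end], order[train_end:val_end], order[val_end:]


if __name__ == "__main__":
    # Toy example: 20 edges whose timestamps arrive in shuffled order.
    toy_timestamps = [float((i * 7) % 20) for i in range(20)]
    train, val, test = chronological_split(toy_timestamps)
    print(len(train), len(val), len(test))  # prints: 14 3 3
```

Because the cut is made on time-sorted edges, every training edge precedes every validation edge, which in turn precedes every test edge, so no future interaction leaks into training.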