Supra-Laplacian Encoding for Transformer on Dynamic Graphs
Authors: Yannis Karmim, Marc Lafon, Raphaël Fournier-S'niehotta, Nicolas Thome
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | SLATE outperforms numerous state-of-the-art methods based on Message-Passing Graph Neural Networks combined with recurrent models (e.g., LSTM), and Dynamic Graph Transformers, on 9 datasets. Code is open-source and available at https://github.com/ykrmm/SLATE. We conduct an extensive experimental validation of our method across 11 real and synthetic discrete-time dynamic graph datasets. SLATE outperforms state-of-the-art results by a large margin. |
| Researcher Affiliation | Academia | Yannis Karmim, Conservatoire National des Arts et Métiers, CEDRIC, EA 4629, F-75003 Paris, France, yannis.karmim@cnam.fr; Marc Lafon, Conservatoire National des Arts et Métiers, CEDRIC, EA 4629, F-75003 Paris, France, marc.lafon@lecnam.net; Raphaël Fournier-S'niehotta, Conservatoire National des Arts et Métiers, CEDRIC, EA 4629, F-75003 Paris, France, fournier@cnam.fr; Nicolas Thome, Sorbonne Université, CNRS, ISIR, F-75005 Paris, France, nicolas.thome@isir.upmc.fr |
| Pseudocode | Yes | Algorithm 1: Computation of the supra-Laplacian spectrum (an illustrative sketch of this step is given after the table). |
| Open Source Code | Yes | Code is open-source and available at this link https://github.com/ykrmm/SLATE. |
| Open Datasets | Yes | Datasets. In Table 6 (Appendix C), we provide detailed statistics for the datasets used in our experiments; an in-depth description of the datasets is given in Appendix C. We evaluate on the DTDG datasets provided by [60] and [55], and we add a synthetic dataset, SBM, based on the stochastic block model [29], to evaluate on a denser DTDG. Table 6 lists: CanParl, USLegis, Flights, Trade, UNVote, Contact, HepPh, AS733, Enron, Colab, SBM. |
| Dataset Splits | Yes | For the datasets from [60], we follow the same graph splitting strategy, which means 70% of the snapshots for training, 15% for validation, and 15% for testing (a chronological-split sketch follows the table). |
| Hardware Specification | Yes | We trained on an NVIDIA-Quadro RTX A6000 with 49 GB of total memory. |
| Software Dependencies | No | The paper mentions using a 'transformer Encoder Layer' [45], 'Flash Attention' [9], and 'Performer' [5], but does not provide specific version numbers for these or other software dependencies like Python or PyTorch. |
| Experiment Setup | Yes | Implementation details. We use one transformer Encoder Layer [45]. We fix the token dimension at d = 128 and the time window at w = 3 for all our experiments. We use an SGD optimizer for all of our experiments. Further details on the hyper-parameter search, including the number of eigenvectors for our spatio-temporal encoding, are in Appendix D. Table 8 hyperparameter search ranges: k ∈ {4, 6, 10, 12, 14}; nhead_xa ∈ {1, 2, 4, 8}; nhead_encoder ∈ {1, 2, 4, 8}; dim_ffn ∈ {128, 512, 1024}; norm_first ∈ {True, False}; learning_rate ∈ {0.1, 0.01, 0.001, 0.0001}; weight_decay ∈ {0, 5e-7} (restated as a config sketch after the table). |
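
The Pseudocode row cites Algorithm 1, the computation of the supra-Laplacian spectrum. The sketch below is a minimal illustration of that idea rather than the authors' implementation: it assumes a window of w snapshot adjacencies over a shared node set, builds the supra-adjacency as a block-diagonal of the snapshots plus identity couplings between consecutive layers, forms the symmetric normalized supra-Laplacian, and keeps its k smallest eigenvectors as a spatio-temporal encoding. The function name `supra_laplacian_encoding` and the `coupling` weight are illustrative.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def supra_laplacian_encoding(snapshots, k=6, coupling=1.0):
    """Illustrative sketch: spectral encoding of a window of snapshot adjacencies.

    snapshots : list of w scipy sparse (n, n) adjacency matrices over the same node set
    k         : number of eigenvectors kept
    Returns an array of shape (w * n, k): one k-dim vector per (node, snapshot) copy.
    """
    w = len(snapshots)
    n = snapshots[0].shape[0]

    # Intra-layer edges: block-diagonal of the w snapshot adjacencies.
    intra = sp.block_diag(snapshots, format="csr")

    # Inter-layer edges: link each node to its own copy in the adjacent snapshots.
    chain = sp.diags([np.ones(w - 1), np.ones(w - 1)], offsets=[1, -1])
    inter = coupling * sp.kron(chain, sp.identity(n), format="csr")

    supra = intra + inter

    # Symmetric normalized supra-Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    deg = np.asarray(supra.sum(axis=1)).ravel()
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(deg), 0.0)
    D = sp.diags(d_inv_sqrt)
    lap = sp.identity(w * n) - D @ supra @ D

    # The k smallest eigenvectors serve as the spatio-temporal positional encoding.
    _, eigvecs = eigsh(lap.tocsc(), k=k, which="SM")
    return eigvecs
```

With the paper's default window w = 3, the encoding yields one k-dimensional vector per (node, snapshot) copy, which can then be fed to the transformer as a positional signal.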
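The Dataset Splits row describes a 70/15/15 chronological split over snapshots. A minimal sketch of such a split, with a hypothetical helper name, is:

```python
def chronological_split(snapshots, train_frac=0.70, val_frac=0.15):
    """Split an ordered list of graph snapshots into train/val/test chronologically."""
    n = len(snapshots)
    n_train = int(round(train_frac * n))
    n_val = int(round(val_frac * n))
    return (snapshots[:n_train],                      # first 70% of snapshots
            snapshots[n_train:n_train + n_val],       # next 15% for validation
            snapshots[n_train + n_val:])              # remaining 15% for testing
```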
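The Experiment Setup row reports fixed settings (d = 128, w = 3, one encoder layer, SGD) and the Table 8 search ranges. They are restated below as a Python configuration for reference; key names follow the table but are not necessarily the authors' argument names.

```python
# Search ranges quoted from Table 8 (Appendix D) of the paper.
SEARCH_SPACE = {
    "k": [4, 6, 10, 12, 14],                     # number of supra-Laplacian eigenvectors
    "nhead_xa": [1, 2, 4, 8],                    # cross-attention heads
    "nhead_encoder": [1, 2, 4, 8],               # transformer-encoder heads
    "dim_ffn": [128, 512, 1024],                 # feed-forward dimension
    "norm_first": [True, False],                 # pre- vs post-layer-norm
    "learning_rate": [0.1, 0.01, 0.001, 0.0001],
    "weight_decay": [0, 5e-7],
}

# Fixed settings reported in the implementation details.
FIXED = {"token_dim": 128, "time_window": 3, "optimizer": "SGD", "n_encoder_layers": 1}
```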