Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Aligning Transformers with Weisfeiler-Leman
Authors: Luis Müller, Christopher Morris
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our transformers on the large-scale PCQM4Mv2 dataset, showing competitive predictive performance with the state-of-the-art and demonstrating strong downstream performance when fine-tuning them on small-scale molecular datasets. |
| Researcher Affiliation | Academia | 1Department of Computer Science, RWTH Aachen University, Germany. Correspondence to: Luis Müller <EMAIL>. |
| Pseudocode | No | The paper describes algorithms mathematically (e.g., in Appendix D for Transformers and Appendix E for Weisfeiler Leman algorithms) but does not provide explicit pseudocode blocks labeled as 'Algorithm' or 'Pseudocode'. |
| Open Source Code | Yes | The source code for all experiments is available at https://github.com/luis-mueller/wl-transformers. |
| Open Datasets | Yes | For pre-training, we train on PCQM4MV2, one of the largest molecular regression datasets available (Hu et al., 2021). |
| Dataset Splits | No | The paper mentions using a validation set ('Validation MAE' in Table 2) and states, 'For model evaluation, we use the code provided by Hu et al. (2021), available at https://github.com/snap-stanford/ogb.' This implies standard splits are used, but specific percentages or sample counts for train/validation/test are not explicitly provided within the paper's text for all datasets. |
| Hardware Specification | Yes | For pre-training...on two A100 NVIDIA GPUs;...on a single A10 Nvidia GPU with 24GB RAM. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Table 6: Hyper-parameters for (2, 1)-GT pre-training on PCQM4MV2. Learning rate: 2e-4; Weight decay: 0.1; Attention dropout: 0.1; Post-attention dropout: 0.1; Batch size: 256; # gradient steps: 2M; # warmup steps: 60K; Precision: bfloat16. |
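
For readers who want to reuse the Table 6 pre-training settings, the values quoted in the Experiment Setup row can be collected in a small configuration object. This is a minimal sketch, not the authors' code: the class name `PretrainConfig` and all field names are hypothetical, and only the numeric values and the precision string come from the row above. The warmup fraction is plain arithmetic on those values; the shape of the learning-rate schedule is not stated here, so none is assumed.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class PretrainConfig:
    """Values quoted from Table 6; class and field names are hypothetical."""

    learning_rate: float = 2e-4
    weight_decay: float = 0.1
    attention_dropout: float = 0.1
    post_attention_dropout: float = 0.1
    batch_size: int = 256
    gradient_steps: int = 2_000_000  # "2M" in the table
    warmup_steps: int = 60_000       # "60K" in the table
    precision: str = "bfloat16"


config = PretrainConfig()

# Share of all gradient steps spent in warmup, derived from the two step counts.
warmup_fraction = config.warmup_steps / config.gradient_steps

print(asdict(config))
print(f"warmup fraction: {warmup_fraction:.1%}")
```

Freezing the dataclass keeps the quoted values from being changed by accident; a fine-tuning run with different settings would need its own instance.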