Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Accelerating Molecular Graph Neural Networks via Knowledge Distillation
Authors: Filip Ekström Kelvinius, Dimitar Georgiev, Artur Toshev, Johannes Gasteiger
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate our proposed methods, we perform comprehensive benchmarking experiments on the OC20-2M [17] dataset (structure to energy and forces (S2EF) task) a large and diverse catalyst dataset; and COLL [6] a challenging molecular dynamics dataset. |
| Researcher Affiliation | Collaboration | Filip Ekström Kelvinius Linköping University EMAIL Dimitar Georgiev Imperial College London EMAIL Artur Petrov Toshev Technical University of Munich EMAIL Johannes Gasteiger Google Research EMAIL |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. Methods are described in prose and mathematical equations. |
| Open Source Code | Yes | Associated code is available online (footnote 2). Footnote 2: https://github.com/gasteigerjo/ocp/blob/main/DISTILL.md |
| Open Datasets | Yes | To evaluate our proposed methods, we perform comprehensive benchmarking experiments on the OC20-2M [17] dataset (structure to energy and forces (S2EF) task) a large and diverse catalyst dataset; and COLL [6] a challenging molecular dynamics dataset. |
| Dataset Splits | Yes | Values represent the average across the four available validation sets. Results for individual validation datasets are provided in Appendix B. |
| Hardware Specification | Yes | Models were trained on NVIDIA A100 40 GB and NVIDIA RTX A6000 48 GB GPUs, except GemNet-OC-small which were trained on NVIDIA A100 80 GB and NVIDIA RTX A6000 48 GB. All models were trained on single GPUs, except for SchNet when trained on OC20-2M, which required 3 GPUs. Inference throughput was profiled on A100 40 GB GPUs, with reported values representing approximate numbers averaged across three evaluations. |
| Software Dependencies | No | The paper mentions using the 'Open Catalyst Project (OCP) codebase' but does not specify version numbers for this or any other software dependencies like programming languages or libraries. |
| Experiment Setup | Yes | We provide detailed information about the hyperparameters we used for each model in Tables 5, 6, and 7. Moreover, we summarize the KD weighting factors λ we used for each model configuration in Table 8. |