Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On the Completeness of Invariant Geometric Deep Learning Models
Authors: Zian Li, Xiyuan Wang, Shijia Kang, Muhan Zhang
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct additional assessments to validate our theoretical claims. The first experiment aims to verify the conclusion that DisGNN is nearly complete by assessing the proportion of its unidentifiable cases in real-world point clouds. The second experiment is designed to evaluate whether the complete models consistently demonstrate separation power for challenging pairs of point clouds, where numerical precision may influence the outcomes. We provide more evaluations of GeoNGNN on practical molecular-relevant tasks in Appendix F, which could offer further insights. |
| Researcher Affiliation | Academia | Zian Li (1,2), Xiyuan Wang (1,2), Shijia Kang (1), Muhan Zhang (1). (1) Institute for Artificial Intelligence, Peking University; (2) School of Intelligence Science and Technology, Peking University. Corresponding author: Muhan Zhang (EMAIL). |
| Pseudocode | No | The paper describes the model updates and aggregations using mathematical equations (e.g., Equation (1) for message passing) and textual descriptions, but does not present structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about the availability of its own source code, nor does it provide a link to a code repository for the methodology described. |
| Open Datasets | Yes | We select two representative datasets, namely QM9 (Ramakrishnan et al., 2014; Wu et al., 2018) and ModelNet40 (Wu et al., 2015), for this assessment. ... For MD22, we adopt the data split specified in Chmiela et al. (2023), which is also consistent with the other works. ... Additionally, we include two other complete models: DimeNet (Gasteiger et al., 2019) and GemNet (Gasteiger et al., 2021), as well as the recent advanced invariant model 2-F-DisGNN (Li et al., 2024). |
| Dataset Splits | Yes | The data split (training/validation/testing) in rMD17 is 950/50/the rest, following related works such as Batatia et al. (2022); Gasteiger et al. (2021). For MD22, we adopt the data split specified in Chmiela et al. (2023), which is also consistent with the other works. Following Batatia et al. (2022), we split the training set and validation set at a ratio of 450/50, utilizing pre-split test sets for evaluation. Following Li et al. (2024), we split the training set and validation set at a ratio of 110K/10K, utilizing pre-split test sets for evaluation. |
| Hardware Specification | Yes | The models are trained on Nvidia RTX 4090 and Nvidia A100 (80GB) GPUs. |
| Software Dependencies | No | The paper mentions using 'PyTorch Geometric (Fey & Lenssen, 2019)' for the DimeNet implementation, but it does not specify the version number of this library or any other software dependencies like Python or PyTorch versions used for their own models. |
| Experiment Setup | Yes | We set the number of VD-Conv layers to 7 for rMD17 and 6 for MD22. The hidden dimension is set to 512 and the dimension of radial basis functions (RBF) is set to 16. The message passing cutoff is set to 13 Å for rMD17 and 7 Å for MD22. We employ the polynomial envelope with p = 6, as proposed in Gasteiger et al. (2019), along with the corresponding cutoff. ... We optimize all models using the Adam optimizer (Kingma & Ba, 2014), incorporating exponential decay and plateau decay learning rate schedulers, as well as a linear learning rate warm-up. To mitigate the risk of overfitting, we employ early stopping based on validation loss and apply exponential moving average (EMA) with a decay rate of 0.99 to the model parameters during the validation and testing phases. |
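The Experiment Setup row cites the polynomial envelope of Gasteiger et al. (2019) with p = 6. As a point of reference, a minimal sketch of that envelope function is below; the function name `poly_envelope` and the scalar (non-tensor) signature are our own choices for illustration, not code from the paper.

```python
def poly_envelope(r: float, cutoff: float, p: int = 6) -> float:
    """Polynomial cutoff envelope from Gasteiger et al. (2019).

    Decays smoothly from 1 at r = 0 to 0 at r = cutoff, with the
    first and second derivatives vanishing at the cutoff so that
    messages fade out continuously.
    """
    d = r / cutoff
    if d >= 1.0:
        return 0.0  # beyond the cutoff the weight is exactly zero
    a = -(p + 1) * (p + 2) / 2
    b = p * (p + 2)
    c = -p * (p + 1) / 2
    return 1.0 + a * d**p + b * d**(p + 1) + c * d**(p + 2)
```

For example, with the paper's rMD17 cutoff of 13 Å, `poly_envelope(0.0, 13.0)` is 1 and `poly_envelope(13.0, 13.0)` is 0, with a smooth monotone decay in between.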
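The same row reports an exponential moving average (EMA) over model parameters with decay 0.99 for validation and testing. A minimal sketch of one EMA update step, assuming plain lists of parameter values (the paper's actual implementation is not shown):

```python
def ema_update(ema_params, model_params, decay=0.99):
    """One EMA step: blend current parameters into the running average.

    Each averaged value moves a fraction (1 - decay) toward the
    current model value; with decay = 0.99 the average changes slowly,
    smoothing out step-to-step noise in the trained weights.
    """
    return [decay * e + (1.0 - decay) * m
            for e, m in zip(ema_params, model_params)]
```

At evaluation time the EMA copy of the parameters, not the raw trained ones, would be used, which is what "during the validation and testing phases" refers to.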