Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Dynamic and Chemical Constraints to Enhance the Molecular Masked Graph Autoencoders

Authors: Jiahui Zhang, Wenjie Du, Yang Wang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We integrate the model-agnostic DyCC into various MGAEs and conduct comprehensive experiments, demonstrating significant performance improvements. Our code is available at https://github.com/forever-ly/DyCC.
Researcher Affiliation	Academia	1 University of Science and Technology of China, China; 2 Suzhou Institute for Advanced Research, USTC, China. EMAIL EMAIL
Pseudocode	Yes	Algorithm 1 The training process of GIBMS; Algorithm 2 Soft Label Generator
Open Source Code	Yes	Our code is available at https://github.com/forever-ly/DyCC.
Open Datasets	Yes	For the pretraining stage, we utilized 2 million molecules sourced from the ZINC15 database [31], following the precedent of prior studies [17]. The GIBMS module was trained using the loss function defined in Eq. (13), where the temperature factor τ = 0.1 for the InfoNCE loss, and β = 0.01 controls the trade-off between prediction and compression. After training the GIBMS module, we utilized it to generate corresponding mask probabilities for each atom of the 2 million molecules in ZINC15 and sampled masked atoms based on these probabilities. In the reconstruction phase, we mapped the hard labels output by the tokenizer to soft labels using the SLG module. By default, we set temperatures τp = 0.25, τy = 0.1, and the number of prototypes n = 128. After pretraining, we employed the widely adopted 8 binary classification datasets within MoleculeNet [39] to evaluate performance on downstream molecular property prediction tasks (see Appendix B). These downstream datasets are divided into train/valid/test sets using scaffold split by 8:1:1 to facilitate an out-of-distribution evaluation setting. We report the mean performances (ROC-AUC) and standard deviations on the downstream datasets across ten random seeds.
Dataset Splits	Yes	These downstream datasets are divided into train/valid/test sets using scaffold split by 8:1:1 to facilitate an out-of-distribution evaluation setting. We apply scaffold splitting for all tasks on all datasets. It splits molecules with distinct two-dimensional structural frameworks into different subsets. It is a more challenging but practical setting, since the test molecules can be structurally different from the training set. Here we apply scaffold splitting to construct the train/validation/test sets.
Hardware Specification	Yes	Our experiments are conducted using an NVIDIA DGX A100 server. Each experiment can be executed on a single GPU while staying within the limit of 30 GB of GPU memory consumption.
Software Dependencies	No	We used the official source code provided by AttrMask, Mole-BERT, and SimSGT, retaining the exact same settings. Building upon this foundation, we introduced the GIBMS and SLG modules. The three additional hyperparameters for GIBMS were set to t = 1, β = 0.01, and τ = 0.1, respectively. The four additional hyperparameters for SLG were set to τy = 0.1, τp = 0.25, α = 1, and n = 128.
Experiment Setup	Yes	For the pretraining stage, we utilized 2 million molecules sourced from the ZINC15 database [31], following the precedent of prior studies [17]. The GIBMS module was trained using the loss function defined in Eq. (13), where the temperature factor τ = 0.1 for the InfoNCE loss, and β = 0.01 controls the trade-off between prediction and compression. After training the GIBMS module, we utilized it to generate corresponding mask probabilities for each atom of the 2 million molecules in ZINC15 and sampled masked atoms based on these probabilities. In the reconstruction phase, we mapped the hard labels output by the tokenizer to soft labels using the SLG module. By default, we set temperatures τp = 0.25, τy = 0.1, and the number of prototypes n = 128.
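
Illustrative sketch: the Dataset Splits row above describes a scaffold split with an 8:1:1 train/valid/test ratio. The snippet below is a minimal sketch of that idea, not the authors' code. It assumes each molecule already has a scaffold identifier (for example a Bemis-Murcko scaffold string computed elsewhere); the function name `scaffold_split` and the greedy largest-group-first assignment are assumptions based on a common convention that the source does not confirm.

```python
from collections import defaultdict


def scaffold_split(scaffolds, ratio=(8, 1, 1)):
    """Split molecule indices into train/valid/test by scaffold group.

    `scaffolds` holds one scaffold identifier per molecule (hypothetical
    input, assumed to be precomputed). All molecules sharing a scaffold
    land in the same subset, so test scaffolds are unseen during training.
    """
    # Group molecule indices by their scaffold identifier.
    groups = defaultdict(list)
    for idx, scaffold in enumerate(scaffolds):
        groups[scaffold].append(idx)

    # Largest groups first; ties broken by first index for determinism.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    n = len(scaffolds)
    total = sum(ratio)
    train_cutoff = n * ratio[0] / total
    valid_cutoff = n * (ratio[0] + ratio[1]) / total

    train, valid, test = [], [], []
    for group in ordered:
        # Fill train first, then valid; whatever does not fit goes to test.
        if len(train) + len(group) <= train_cutoff:
            train.extend(group)
        elif len(train) + len(valid) + len(group) <= valid_cutoff:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test


# Toy usage: ten molecules with four distinct (made-up) scaffold labels.
toy_scaffolds = ["A"] * 5 + ["B"] * 3 + ["C", "D"]
train_idx, valid_idx, test_idx = scaffold_split(toy_scaffolds)
```

Because whole scaffold groups are assigned together, the realized proportions only approximate 8:1:1 on real datasets; the toy input is chosen so that the ratio is met exactly.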