Spanning Tree-based Graph Generation for Molecules

Authors: Sungsoo Ahn, Binghong Chen, Tianzhe Wang, Le Song

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on QM9, ZINC250k, and MOSES benchmarks verify the effectiveness of the proposed framework in metrics such as validity, Fréchet ChemNet distance, and fragment similarity. We also demonstrate the usefulness of STGG in maximizing penalized LogP value of molecules.
Researcher Affiliation | Collaboration | POSTECH, Georgia Institute of Technology, BioMap, MBZUAI; sungsoo.ahn@postech.ac.kr, {binghong, tianzhe}@gatech.edu, dasongle@gmail.com
Pseudocode | Yes | We provide the full algorithm in Algorithm 1.
Open Source Code | Yes | We submit the full implementation of our STGG framework and the baselines used in our experiments as a supplementary material.
Open Datasets | Yes | We experiment on popular graph generation benchmarks of QM9, ZINC250K, and MOSES to validate the effectiveness of our algorithm.
Dataset Splits | No | We train our generative model on the respective datasets and sample 10,000 molecules to measure (a) the ratio of valid molecules (VALID), (b) the ratio of unique molecules (UNIQUE), and (c) the ratio of novel molecules with respect to the training dataset (NOVEL). This describes metrics evaluated on generated samples rather than the train/validation/test splits used for training (a sketch of these metrics appears after this table). The paper also states that "The similarity metrics of FCD, SNN, FRAG, SCAF are measured with respect to the test dataset of molecules"; while this refers to a test set, it does not specify the full train/validation/test splits or how a validation set was used for model tuning.
Hardware Specification | Yes | Using a single Quadro RTX 6000 GPU, it takes approximately three, ten, and 96 hours to fully train the models on QM9, ZINC250K, and MOSES datasets, respectively.
Software Dependencies | No | The paper mentions the "AdamW (Loshchilov & Hutter, 2019) optimizer" and refers to "Transformer-related configurations" from Vaswani et al. (2017), but it does not specify software versions for programming languages, libraries, or frameworks (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | For all the experiments, we train the Transformer under the STGG framework for 100 epochs with a batch size of 128 for all the datasets. We use the AdamW (Loshchilov & Hutter, 2019) optimizer with a constant learning rate of 10⁻⁴. We use three and six Transformer layers for {QM9, ZINC250K} and MOSES, respectively. The rest of the Transformer-related configurations follow those of the original work (Vaswani et al., 2017); we use the attention module with an embedding size of 1024 and eight heads, an MLP with dimension 2048, and dropout with probability 0.1. A configuration sketch matching these hyperparameters appears below.
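The VALID, UNIQUE, and NOVEL ratios quoted in the Dataset Splits row can be recomputed from generated molecules. The following is a minimal sketch assuming RDKit is available and that both the generated samples and the training set are given as lists of SMILES strings; the function and variable names are illustrative and not taken from the STGG codebase.

```python
# Hypothetical sketch of the VALID / UNIQUE / NOVEL metrics described above.
# Assumes RDKit is installed; names here are illustrative, not the authors' code.
from rdkit import Chem


def generation_metrics(generated_smiles, train_smiles):
    """Compute validity, uniqueness, and novelty ratios for generated SMILES."""
    # A molecule counts as valid if RDKit can parse and sanitize its SMILES.
    valid = [s for s in generated_smiles if Chem.MolFromSmiles(s) is not None]
    valid_ratio = len(valid) / len(generated_smiles)

    # Uniqueness is measured over canonical SMILES of the valid molecules.
    canonical = {Chem.MolToSmiles(Chem.MolFromSmiles(s)) for s in valid}
    unique_ratio = len(canonical) / max(len(valid), 1)

    # Novelty is the fraction of unique molecules absent from the training set.
    train_canonical = {
        Chem.MolToSmiles(m)
        for m in (Chem.MolFromSmiles(s) for s in train_smiles)
        if m is not None
    }
    novel_ratio = len(canonical - train_canonical) / max(len(canonical), 1)

    return valid_ratio, unique_ratio, novel_ratio
```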
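The hyperparameters quoted in the Experiment Setup row map onto a standard Transformer configuration. Below is a minimal sketch in which a vanilla PyTorch TransformerEncoder stands in for the STGG model; the vocabulary size and the Sequential wrapper are assumptions, and the authors' actual model is an autoregressive decoder over spanning-tree tokens with causal masking, which this sketch omits.

```python
# Sketch of the reported training configuration (not the authors' implementation).
import torch
import torch.nn as nn

VOCAB_SIZE = 128   # assumption: size of the spanning-tree token vocabulary
NUM_LAYERS = 3     # 3 layers for QM9/ZINC250K, 6 for MOSES (per the paper)

encoder_layer = nn.TransformerEncoderLayer(
    d_model=1024,          # embedding size of 1024
    nhead=8,               # eight attention heads
    dim_feedforward=2048,  # MLP dimension of 2048
    dropout=0.1,           # dropout probability of 0.1
    batch_first=True,
)
model = nn.Sequential(
    # embedding + encoder stack; a real decoder would add causal masking
    nn.Embedding(VOCAB_SIZE, 1024),
    nn.TransformerEncoder(encoder_layer, num_layers=NUM_LAYERS),
    nn.Linear(1024, VOCAB_SIZE),
)

# AdamW with a constant learning rate of 1e-4, as stated in the paper.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
EPOCHS, BATCH_SIZE = 100, 128  # 100 epochs, batch size 128 for all datasets
```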