Graph Generation with Diffusion Mixture

Authors: Jaehyeong Jo, Dongki Kim, Sung Ju Hwang

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Through extensive experimental validation on general graph and 2D/3D molecule generation tasks, we show that our method outperforms previous generative models, generating graphs with correct topology with both continuous (e.g. 3D coordinates) and discrete (e.g. atom types) features." and "4. Experiments. We experimentally validate our method on diverse real-world graph generation tasks."
Researcher Affiliation | Collaboration | 1 Korea Advanced Institute of Science and Technology (KAIST), 2 DeepAuto.ai
Pseudocode | Yes | Algorithm 1 (Training), Algorithm 2 (Sampling), Algorithm 3 (Training of GruM), Algorithm 4 (Sampling of GruM), Algorithm 5 (PC Sampler for GruM)
Open Source Code | Yes | "Our code is available at https://github.com/harryjo97/GruM."
Open Datasets | Yes | "Datasets and Metrics. We evaluate the quality of generated graphs on three synthetic and real datasets used as benchmarks in previous works (Martinkus et al., 2022): Planar, Stochastic Block Model (SBM), and Proteins (Dobson & Doig, 2003)." and "We evaluate the quality of generated 2D molecules on two molecule datasets used as benchmarks in Jo et al. (2022): QM9 (Ramakrishnan et al., 2014) and ZINC250k (Irwin et al., 2012)." and "We evaluate the generated 3D molecules on two standard molecule datasets used as benchmarks in Hoogeboom et al. (2022): QM9 (Ramakrishnan et al., 2014) (up to 29 atoms) and GEOM-DRUGS (Axelrod & Gomez-Bombarelli, 2022) (up to 181 atoms)."
Dataset Splits | Yes | "We follow the evaluation setting of Martinkus et al. (2022) using the same data split." (Section 4.1) and "Following the evaluation setting of Jo et al. (2022)..." (Section 4.2) and "Following Hoogeboom et al. (2022)..." (Section 4.3)
Hardware Specification | Yes | "For all experiments, we use NVIDIA GeForce RTX 3090 and 2080 Ti and implement the source code with PyTorch (Paszke et al., 2019)."
Software Dependencies | No | The paper mentions PyTorch and the RDKit library but does not specify their version numbers, which are required for reproducible software dependencies.
Experiment Setup | Yes | "For our proposed GruM, we train our model for 30,000 epochs for all datasets using a constant learning rate with AdamW optimizer (Loshchilov & Hutter, 2017) and weight decay 10^-12, applying the exponential moving average (EMA) to the parameters (Song & Ermon, 2020). We set the diffusion steps to 1000 for a fair comparison." and "For our GruM, we train our model sθ for 1,300 epochs with batch size 256 for the QM9 experiment, and for 13 epochs with batch size 64 for the GEOM-DRUGS experiment. We apply EMA to the parameters of the model with a coefficient of 0.999 and use AdamW optimizer with learning rate 10^-4 and weight decay 10^-12. The 3D coordinates and charge values are scaled as 4 and 0.1, respectively, and we use the simplified loss with a constant c = 100."
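The EMA of model parameters described in the setup (coefficient 0.999, applied alongside AdamW updates) can be sketched as follows. This is a minimal illustration of the exponential-moving-average rule itself, not the authors' implementation; the class and parameter names are hypothetical, and in practice one would apply this to PyTorch tensors (e.g. via `torch.optim.swa_utils.AveragedModel`) rather than plain floats.

```python
class EMA:
    """Exponential moving average of model parameters.

    Maintains a 'shadow' copy updated after each optimizer step:
        shadow <- decay * shadow + (1 - decay) * params
    With decay = 0.999 (as in the paper), the shadow weights change
    slowly, smoothing out noise from individual gradient updates.
    """

    def __init__(self, params, decay=0.999):
        self.decay = decay
        self.shadow = list(params)  # copy of the initial parameters

    def update(self, params):
        # Move each shadow value a small step toward the live parameter.
        self.shadow = [
            self.decay * s + (1.0 - self.decay) * p
            for s, p in zip(self.shadow, params)
        ]


# Toy usage: one scalar "parameter" jumping from 0.0 to 1.0.
ema = EMA([0.0], decay=0.999)
ema.update([1.0])
print(ema.shadow[0])  # 0.001 -- the shadow moves only 0.1% per step
```

At sampling time, the shadow (EMA) weights are typically used in place of the raw trained weights, which is the role EMA plays in score-based diffusion training (Song & Ermon, 2020).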