Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Conditional Diffusion Based on Discrete Graph Structures for Molecular Graph Generation

Authors: Han Huang, Leilei Sun, Bowen Du, Weifeng Lv

AAAI 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on diverse datasets validate the effectiveness of our framework. Particularly, the proposed method still generates high-quality molecular graphs in a limited number of steps. In this section, we display the experimental results of the proposed discrete graph structure assisted diffusion framework on multiple datasets.
Researcher Affiliation | Academia | Han Huang, Leilei Sun, Bowen Du*, Weifeng Lv State Key Laboratory of Software Development Environment, Beihang University, China EMAIL
Pseudocode | Yes | Algorithm 1: Optimizing CDGS; Algorithm 2: Sampling from CDGS with the Euler-Maruyama method
Open Source Code | Yes | We provide more experiment details in Appendix, and we release the code at https://github.com/GRAPH-0/CDGS.
Open Datasets | Yes | We train and evaluate models on two molecule datasets, ZINC250k (Irwin et al. 2012) and QM9 (Ramakrishnan et al. 2014).
Dataset Splits | No | We use 8:2 as the split ratio for train/test.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or cloud instance types used for running experiments.
Software Dependencies | No | The paper mentions software like RDKit but does not provide specific version numbers for any key software components or libraries required for replication.
Experiment Setup | Yes | We pretrain the time-dependent predictor on perturbed graphs of the ZINC250k dataset for 200 epochs. Each initial molecular graph is encoded into latent codes at the middle time t_ξ = 0.3 through the forward-time ODE solver. After 50 gradient ascent steps, all latent codes are decoded back to molecules with another gradient-guided reverse-time ODE solver.
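
The quoted train/test ratio fixes only the sizes of the two partitions, not which molecules land in each. As an illustration, here is a minimal sketch of a seeded 8:2 split. The function name `split_indices`, the seed value, and the dataset size of 1000 are hypothetical choices for this example. They are not taken from the paper or from the CDGS repository, which may split the data differently.

```python
import random


def split_indices(n_items, train_fraction=0.8, seed=0):
    """Return (train, test) index lists for a seeded random split.

    Hypothetical helper for illustration only; not the paper's procedure.
    """
    indices = list(range(n_items))
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    rng.shuffle(indices)
    n_train = int(round(n_items * train_fraction))
    return indices[:n_train], indices[n_train:]


# Example with a placeholder dataset size (assumption, not a real dataset size).
train_idx, test_idx = split_indices(1000, train_fraction=0.8, seed=0)
print(len(train_idx), len(test_idx))
```

Recording the seed, or the resulting index lists, alongside the ratio would let a reader recreate the same partitions.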