Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Conditional Diffusion Based on Discrete Graph Structures for Molecular Graph Generation
Authors: Han Huang, Leilei Sun, Bowen Du, Weifeng Lv
AAAI 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on diverse datasets validate the effectiveness of our framework. Particularly, the proposed method still generates high-quality molecular graphs in a limited number of steps. In this section, we display the experimental results of the proposed discrete graph structure assisted diffusion framework on multiple datasets. |
| Researcher Affiliation | Academia | Han Huang, Leilei Sun, Bowen Du*, Weifeng Lv State Key Laboratory of Software Development Environment, Beihang University, China EMAIL |
| Pseudocode | Yes | Algorithm 1: Optimizing CDGS; Algorithm 2: Sampling from CDGS with the Euler-Maruyama method |
| Open Source Code | Yes | We provide more experiment details in Appendix, and we release the code at https://github.com/GRAPH-0/CDGS. |
| Open Datasets | Yes | We train and evaluate models on two molecule datasets, ZINC250k (Irwin et al. 2012) and QM9 (Ramakrishnan et al. 2014). |
| Dataset Splits | No | We use 8 : 2 as the split ratio for train/test. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or cloud instance types used for running experiments. |
| Software Dependencies | No | The paper mentions software like RDKit but does not provide specific version numbers for any key software components or libraries required for replication. |
| Experiment Setup | Yes | We pretrain the time-dependent predictor on perturbed graphs of the ZINC250k dataset for 200 epochs. Each initial molecular graph is encoded into latent codes at the middle time t_ξ = 0.3 through the forward-time ODE solver. After 50 gradient ascent steps, all latent codes are decoded back to molecules with another gradient-guided reverse-time ODE solver. |
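
The Dataset Splits row quotes only an 8 : 2 train/test ratio. As a minimal sketch of how a split with that ratio could be made repeatable, the Python snippet below shuffles with a fixed seed before cutting the list. The function name `split_train_test`, the seed value, and the placeholder molecule IDs are assumptions for illustration; none of them come from the paper or the CDGS repository.

```python
import random


def split_train_test(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of `items` with a fixed seed and cut it at `train_fraction`.

    Hypothetical helper: the paper states only the 8 : 2 ratio, not how
    the split was drawn.
    """
    shuffled = list(items)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # private RNG, so the result is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder IDs standing in for molecules; not real dataset entries.
molecules = [f"mol_{i}" for i in range(10)]
train, test = split_train_test(molecules)
print(len(train), len(test))
```

Calling the function again with the same seed returns the same partition, while a different seed generally yields a different one. A ratio alone therefore does not pin down which molecules land in the test set.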