Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Chemistry-Inspired Diffusion with Non-Differentiable Guidance
Authors: Yuchen Shen, Chenhao Zhang, Sijie Fu, Chenghui Zhou, Newell Washburn, Barnabás Póczos
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that our method: (1) significantly reduces atomic forces, enhancing the validity of generated molecules when used for stability optimization; (2) is compatible with both explicit and implicit guidance in diffusion models, enabling joint optimization of molecular properties and stability; and (3) generalizes effectively to molecular optimization tasks beyond stability optimization. (Abstract) |
| Researcher Affiliation | Academia | Yuchen Shen, Chenhao Zhang, Sijie Fu, Chenghui Zhou, Newell Washburn, Barnabás Póczos, Carnegie Mellon University, Pittsburgh, PA 15213, USA |
| Pseudocode | Yes | Algorithm 1 Bilevel guided diffusion sampling with noisy neural guidance... Algorithm 2 CHEMGUIDE diffusion sampling with evolutionary algorithm... Algorithm 3 Bilevel guided diffusion sampling with clean neural guidance |
| Open Source Code | Yes | https://github.com/A-Chicharito-S/ChemGuide (Abstract) ... Our implementation is available at https://github.com/A-Chicharito-S/ChemGuide. (Section B) |
| Open Datasets | Yes | The models in our experiment are trained on the QM9 dataset (Ramakrishnan et al.) and the GEOM dataset (Axelrod & Gómez-Bombarelli). |
| Dataset Splits | No | The paper mentions training on QM9 and GEOM datasets and sampling molecules for evaluation (e.g., "We sample 500 molecules from QM9"), but it does not specify explicit training/validation/test splits, percentages, or methodology used for partitioning the datasets. |
| Hardware Specification | Yes | Hardware & Time: We use a 48 GiB A6000 GPU with AMD EPYC 7513 32-Core Processors for our experiments. (Section B) |
| Software Dependencies | No | The paper mentions using specific models like EDM and GeoLDM, a method like GFN2-xTB, and the external tool Gaussian16, but it does not provide specific version numbers for programming languages or libraries (e.g., Python 3.x, PyTorch 1.x) that were part of their implementation. |
| Experiment Setup | Yes | We choose s (Eq. 6) from {1, 10^1, 10^2, 10^3, 10^4} for all experiments, and additionally {2, 5, 10, 20, 25, 30, 40, 50} for the 6 properties. ... We add guidance to the last 400 of the 1000 diffusion steps (Han et al., 2024) (Section 4.1) |
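The quoted setup applies guidance only during the final 400 of 1000 reverse-diffusion steps. A minimal sketch of that scheduling idea, with stand-in `denoise_step` and `guidance_grad` functions (hypothetical placeholders, not the authors' implementation):

```python
def denoise_step(x, t):
    # Stand-in unguided reverse-diffusion update (placeholder dynamics).
    return x * 0.99

def guidance_grad(x, t):
    # Stand-in guidance signal, e.g. a score derived from a
    # non-differentiable oracle such as GFN2-xTB forces (assumption).
    return 0.01 * x

def sample(x, total_steps=1000, guided_steps=400, scale=1.0):
    # Run the full reverse process, but apply the guidance correction
    # only in the last `guided_steps` steps, mirroring the paper's
    # "guidance on the last 400 of 1000 steps" schedule.
    for t in reversed(range(total_steps)):
        x = denoise_step(x, t)
        if t < guided_steps:
            x = x - scale * guidance_grad(x, t)
    return x
```

The scale `s` swept in the quoted setup corresponds to the `scale` argument here; the sketch only illustrates when guidance is applied, not how the bilevel or evolutionary variants in Algorithms 1-3 compute it.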