Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Straight-Line Diffusion Model for Efficient 3D Molecular Generation
Authors: Yuyan Ni, Shikun Feng, Haohan Chi, Bowen Zheng, Huan-ang Gao, Wei-Ying Ma, Zhi-Ming Ma, Yanyan Lan
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to demonstrate the potential of straight-line diffusion in 3D molecular generation and other domains. As shown in Figure 1, using only at most 10 or 15 sampling steps, SLDM surpasses EDM or EquiFM, GeoBFN with 1000 sampling steps, achieving up to 100- or 70-fold improvement in sampling efficiency. To validate the advantages of our method in molecular generation, we evaluate its overall performance and sampling efficiency in both unconditional and conditional generation scenarios. |
| Researcher Affiliation | Academia | 1 Academy of Mathematics and Systems Science, Chinese Academy of Sciences; 2 Zhongguancun Institute of Artificial Intelligence, China; 3 Institute for AI Industry Research (AIR), Tsinghua University; 4 University of Chinese Academy of Sciences; 5 Huazhong University of Science and Technology; 6 Beijing Frontier Research Center for Biological Structure, Tsinghua University; 7 Beijing Academy of Artificial Intelligence. Corresponding author: Yanyan Lan (EMAIL). |
| Pseudocode | Yes | The complete training and sampling procedure of straight-line diffusion are given in algorithm 1 and 2. The SLDM algorithms tailored for molecular generation are provided in Appendix B. |
| Open Source Code | Yes | The code is open-sourced at https://github.com/fengshikun/SLDM |
| Open Datasets | Yes | We evaluate our model using two widely adopted datasets for unconditional molecular generation, with all dataset splitting strictly following baseline settings [Hoogeboom et al., 2022, Song et al., 2024, 2023a]. QM9 [Ruddigkeit et al., 2012, Ramakrishnan et al., 2014] contains approximately 134,000 small organic molecules... GEOM-Drugs [Axelrod and Gomez-Bombarelli, 2022] focuses on drug-like molecules... |
| Dataset Splits | Yes | QM9 [Ruddigkeit et al., 2012, Ramakrishnan et al., 2014] contains approximately 134,000 small organic molecules with up to nine heavy atoms. It is split into training (100K), validation (18K), and test (13K) sets. GEOM-Drugs [Axelrod and Gomez-Bombarelli, 2022]... The dataset is randomly divided into training, validation, and test sets using an 8:1:1 ratio. |
| Hardware Specification | Yes | For QM9, it takes approximately 10 days on a single A100 GPU. For GEOM-drugs, it takes approximately 16 days on four A100 GPUs. |
| Software Dependencies | No | Optimizer Adam |
| Experiment Setup | Yes | The hyperparameter settings for molecular generation are detailed in Table 7. Settings follow UniGEM [Feng et al., 2024], with two additional tunable hyperparameters introduced by our generative algorithm: the noise variance σ and the temperature annealing rate ν. Table 7: Network and training hyperparameters. Embedding size: 256 for unconditional generation, 192 for conditional generation; Layer number: 9 for QM9, 4 for Geom-Drugs; Shared layers: 1; Batch size: 64 for QM9, 128 for Geom-Drugs; Train epoch: 3000 for QM9, 32 for Geom-Drugs; Learning rate: 1.00 × 10⁻⁴; Optimizer: Adam; Sample steps T: 10 to 1000; Nucleation time: 10; Oversampling ratio: 0.5 for each branch; Loss weight: 1 for each loss term; Noise variance σ: 0.05 for unconditional generation, 0.1 for conditional generation; Temperature annealing rate ν: 0.5 for unconditional generation, 3 for conditional generation; Non-uniform discretization: False if T > 13. |
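
For readers who want to reuse the Experiment Setup values, the quoted Table 7 settings can be gathered into a small configuration object. The following is a minimal sketch, not the authors' code: the class name `SLDMConfig`, its field names, and the dataset keys `"qm9"` and `"geom_drugs"` are hypothetical and do not come from the paper or its repository. The learning rate is read as 1e-4, which is an assumption based on the extracted text "1.00 10 4".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLDMConfig:
    """Hypothetical container for the Table 7 values quoted above.

    All names here are illustrative; they are not taken from the SLDM code.
    """

    dataset: str                      # "qm9" or "geom_drugs" (hypothetical keys)
    conditional: bool = False         # conditional vs. unconditional generation
    shared_layers: int = 1
    learning_rate: float = 1e-4       # assumption: "1.00 10 4" read as 1.00 x 10^-4
    optimizer: str = "Adam"
    nucleation_time: int = 10
    oversampling_ratio: float = 0.5   # for each branch
    loss_weight: float = 1.0          # for each loss term

    def __post_init__(self) -> None:
        if self.dataset not in ("qm9", "geom_drugs"):
            raise ValueError(f"unknown dataset: {self.dataset!r}")

    @property
    def embedding_size(self) -> int:
        # 256 for unconditional generation, 192 for conditional generation
        return 192 if self.conditional else 256

    @property
    def num_layers(self) -> int:
        # 9 for QM9, 4 for Geom-Drugs
        return 9 if self.dataset == "qm9" else 4

    @property
    def batch_size(self) -> int:
        # 64 for QM9, 128 for Geom-Drugs
        return 64 if self.dataset == "qm9" else 128

    @property
    def train_epochs(self) -> int:
        # 3000 for QM9, 32 for Geom-Drugs
        return 3000 if self.dataset == "qm9" else 32

    @property
    def noise_variance(self) -> float:
        # sigma: 0.05 unconditional, 0.1 conditional
        return 0.1 if self.conditional else 0.05

    @property
    def annealing_rate(self) -> float:
        # nu: 0.5 unconditional, 3 conditional
        return 3.0 if self.conditional else 0.5


if __name__ == "__main__":
    cfg = SLDMConfig(dataset="qm9")
    print(cfg.num_layers, cfg.batch_size, cfg.noise_variance)
```

Deriving the dataset-dependent and mode-dependent values as properties keeps each Table 7 row in one place, so a change to one setting cannot drift out of sync with the others.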