Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
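The validation step described in the notice, comparing LLM-assigned labels against a manually labeled dataset, amounts to a per-variable agreement computation. A minimal sketch of that comparison, with all function names and data purely illustrative (not from the actual pipeline in [1]):

```python
# Hypothetical sketch: estimating per-variable accuracy of an LLM classifier
# against a manually labeled gold set. All names and data are illustrative.
from collections import defaultdict

def per_variable_accuracy(llm_labels, manual_labels):
    """Fraction of papers where the LLM label matches the manual label,
    computed separately for each reproducibility variable."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for paper_id, variables in manual_labels.items():
        for variable, gold in variables.items():
            total[variable] += 1
            if llm_labels.get(paper_id, {}).get(variable) == gold:
                correct[variable] += 1
    return {v: correct[v] / total[v] for v in total}

# Toy example with two papers and two variables.
manual = {
    "paper_a": {"Open Source Code": "Yes", "Dataset Splits": "Yes"},
    "paper_b": {"Open Source Code": "No", "Dataset Splits": "Yes"},
}
llm = {
    "paper_a": {"Open Source Code": "Yes", "Dataset Splits": "No"},
    "paper_b": {"Open Source Code": "No", "Dataset Splits": "Yes"},
}

acc = per_variable_accuracy(llm, manual)
print(acc)  # {'Open Source Code': 1.0, 'Dataset Splits': 0.5}
```

Accuracy computed this way is only as trustworthy as the manual labels themselves, which is why the notice frames the scores as estimates.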

Latent 3D Graph Diffusion

Authors: Yuning You, Ruida Zhou, Jiwoong Park, Haotian Xu, Chao Tian, Zhangyang Wang, Yang Shen

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We validate through comprehensive experiments that our method generates 3D molecules of higher validity / drug-likeliness and comparable or better conformations / energetics, while being an order of magnitude faster in training." |
| Researcher Affiliation | Academia | Yuning You¹, Ruida Zhou², Jiwoong Park¹, Haotian Xu¹, Chao Tian¹, Zhangyang Wang³, Yang Shen¹ (¹Texas A&M University, ²University of California, Los Angeles, ³University of Texas at Austin) |
| Pseudocode | Yes | "Please refer to Algs. 1 & 2 for details." |
| Open Source Code | Yes | "Codes are released at https://github.com/Shen-Lab/LDM-3DG." |
| Open Datasets | Yes | "We pretrain our topological and geometric AEs on the large-scale public databases as ChEMBL (Gaulton et al., 2012) and PubChemQC (Nakata & Shimazaki, 2017), respectively, which can be repetitively utilized in almost all later experiments." |
| Dataset Splits | Yes | "We initially split the QM9 training set into two halves, each containing 50,000 samples." |
| Hardware Specification | No | The paper mentions "advanced computing resources provided by Texas A&M High Performance Research Computing" but does not specify particular hardware components such as CPU or GPU models or memory capacity. |
| Software Dependencies | No | The paper cites tools such as RDKit (Landrum, 2013) and AutoDock Vina (Huey et al., 2012) but does not give concrete version numbers for these or any other software dependencies needed for reproduction. |
| Experiment Setup | No | The paper mentions pretraining the AEs for "2 days" and using "teacher forcing" but lacks specific hyperparameter values such as learning rate, batch size, or optimizer settings, which are typically required for a reproducible experimental setup. |