reproducibilityindex.ai

GeoDiff: A Geometric Diffusion Model for Molecular Conformation Generation

Authors: Minkai Xu, Lantao Yu, Yang Song, Chence Shi, Stefano Ermon, Jian Tang

ICLR 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct comprehensive experiments on multiple benchmarks, including conformation generation and property prediction tasks. Numerical results show that GEODIFF consistently outperforms existing state-of-the-art machine learning approaches, and by a large margin on the more challenging large molecules.
Researcher Affiliation	Academia	Minkai Xu (1,2), Lantao Yu (3), Yang Song (3), Chence Shi (1,2), Stefano Ermon (3), Jian Tang (1,4,5). (1) Mila Québec AI Institute, Canada; (2) Université de Montréal, Canada; (3) Stanford University, USA; (4) HEC Montréal, Canada; (5) CIFAR AI Research Chair
Pseudocode	Yes	Algorithm 1 Sampling Algorithm of GEODIFF. Input: the molecular graph G, the learned reverse model ϵθ. Output: the molecular conformation C. 1: Sample C^T ∼ p(C^T) = N(0, I) 2: for s = T, T−1, …, 1 do 3: Shift C^s to zero CoM 4: Compute µθ(C^s, G, s) from ϵθ(C^s, G, s) using equation 4 5: Sample C^{s−1} ∼ N(C^{s−1}; µθ(C^s, G, s), σ_t^2 I) 6: end for 7: return C^0 as C
Open Source Code	Yes	Code is available at https://github.com/MinkaiXu/GeoDiff.
Open Datasets	Yes	Following prior works (Xu et al., 2021a;b), we also use the recent GEOM-QM9 (Ramakrishnan et al., 2014) and GEOM-Drugs (Axelrod & Gomez-Bombarelli, 2020) datasets.
Dataset Splits	Yes	For both datasets, the training split consists of 40,000 molecules with 5 conformations for each, resulting in 200,000 conformations in total. The valid split shares the same size as the training split. The test split contains 200 distinct molecules, with 22,408 conformations for QM9 and 14,324 for Drugs.
Hardware Specification	Yes	For the training of GEODIFF, we train the model on a single Tesla V100 GPU with a learning rate of 0.001 until convergence and Adam (Kingma & Welling, 2013) as the optimizer.
Software Dependencies	No	The paper mentions using MPNNs but does not provide specific version numbers for any software libraries, frameworks, or dependencies used in the experiments (e.g., Python, PyTorch, CUDA versions).
Experiment Setup	Yes	The other hyper-parameters of GEODIFF are summarized in Tab. 4, including highest variance level βT, lowest variance level β1, the variance schedule, number of diffusion timesteps T, radius threshold for determining the neighbor of atoms τ, batch size, and number of training iterations.
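
The sampling loop quoted in the Pseudocode row, together with the β1, βT, and T hyper-parameters named in the Experiment Setup row, can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the linear schedule, the standard DDPM form assumed for "equation 4", the choice σ² = β, and the names `linear_beta_schedule`, `remove_center_of_mass`, `sample_conformation`, and `eps_model` are all hypothetical. The learned reverse model is replaced by a stand-in that predicts zero noise.

```python
import numpy as np


def linear_beta_schedule(beta_1, beta_T, T):
    """Assumed linear variance schedule from beta_1 (lowest) to beta_T (highest)."""
    return np.linspace(beta_1, beta_T, T)


def remove_center_of_mass(C):
    """Shift coordinates of shape (n_atoms, 3) so their mean sits at the origin."""
    return C - C.mean(axis=0, keepdims=True)


def sample_conformation(eps_model, graph, n_atoms, betas, rng):
    """Reverse-diffusion sampling following the quoted Algorithm 1.

    eps_model(C, graph, s) is a hypothetical callable standing in for the
    learned noise predictor; it must return an array shaped like C.
    """
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    T = len(betas)

    C = rng.standard_normal((n_atoms, 3))          # step 1: C^T ~ N(0, I)
    for s in range(T, 0, -1):                      # step 2: s = T, ..., 1
        i = s - 1                                  # zero-based index into the schedule
        C = remove_center_of_mass(C)               # step 3: shift to zero CoM
        eps = eps_model(C, graph, s)               # step 4: predicted noise
        # Assumed form of the mean (standard DDPM parameterization).
        mu = (C - betas[i] / np.sqrt(1.0 - alpha_bars[i]) * eps) / np.sqrt(alphas[i])
        # Step 5: sample around the mean; the variance beta is an assumption.
        C = mu + np.sqrt(betas[i]) * rng.standard_normal(C.shape)
    return C                                       # step 7: C^0


# Usage with a stand-in model that always predicts zero noise.
rng = np.random.default_rng(0)
betas = linear_beta_schedule(1e-4, 2e-2, 50)       # illustrative values only
conformation = sample_conformation(
    lambda C, graph, s: np.zeros_like(C), graph=None, n_atoms=5, betas=betas, rng=rng
)
print(conformation.shape)
```

Like the listing, the sketch adds noise at every step, including the last one. Because step 3 recentres only the input of each step, the returned coordinates are not guaranteed to have zero center of mass.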