Score-based 3D molecule generation with neural fields
Authors: Matthieu Kirchmeyer, Pedro O. Pinheiro, Saeed Saremi
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We introduce a new model based on walk-jump sampling [1] for unconditional 3D molecule generation in the continuous space using neural fields. ... Our method achieves competitive results on drug-like molecules and easily scales to macro-cyclic peptides, with at least one order of magnitude faster sampling. ... We compare FuncMol and FuncMol_dec to three state-of-the-art approaches. ... We evaluate FuncMol on three datasets: QM9 [81], GEOM-drugs [82] and CREMP [12]. |
| Researcher Affiliation | Industry | Matthieu Kirchmeyer*, Pedro O. Pinheiro*, Saeed Saremi; Prescient Design, Genentech |
| Pseudocode | Yes | Algorithm 1: Auto-decoding conditional neural field training pseudo-code Equation (2) ... Algorithm 2: Auto-encoding conditional neural field training pseudo-code Equation (3) ... Algorithm 3: Denoiser training pseudo-code Equation (6) ... Algorithm 4: Sampling pseudo-code |
| Open Source Code | Yes | The code is available at https://github.com/prescient-design/funcmol. |
| Open Datasets | Yes | We evaluate FuncMol on three datasets: QM9 [81], GEOM-drugs [82] and CREMP [12]. |
| Dataset Splits | Yes | We use a split of 100K/20K/13K molecules for QM9, 1.1M/146K/146K on GEOM-drugs and 409K/10K/9K on CREMP for train, validation and test, respectively. |
| Hardware Specification | Yes | on 4 A100 GPUs |
| Software Dependencies | No | The paper mentions software such as RDKit, Open Babel, MFN, and VoxMol's implementation, but gives no version numbers for any of them, nor for the programming language or deep learning framework used (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | Our main model, FuncMol, follows the auto-encoding approach described in Section 3.2. The codes z are computed with an encoder that takes as input a low-resolution voxelized representation of the molecular field with grid dimension of 16 × 16 × 16. ... We consider modulation codes with dimension 1024 on QM9 and 2048 on GEOM-drugs and CREMP. We use the same neural field network for all datasets: a conditional MFN with Gabor filters and 6 FiLM-modulated layers, where each fully-connected layer has 2048 hidden units. We augment the training set by applying random rotations on the three Euler angles. The weights of the latent code encoder and neural field decoder are trained jointly. ... We choose a noise level in normalized space of σ = 1.2 for GEOM-drugs and CREMP, σ = 2.0 for QM9. Our code denoiser is a modified version of the denoiser used in [36]: a fully-connected network with 18 residual blocks (each with two linear layers with 6144 hidden units) and skip connections. ... We initialize the MCMC chains with noise and use the following sampling hyperparameters γ = 1.0 and δ = σ/2 as in [5, 78]. For evaluation purposes, we generate one sample per chain. We consider 1000 steps for QM9 and GEOM-drugs and 10000 steps for CREMP. |
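
The sampling settings quoted in the Experiment Setup row (chains initialized with noise, γ = 1.0, δ = σ/2, one sample per chain) can be illustrated with a minimal walk-jump sketch. This is not the authors' implementation: the learned denoiser is replaced by the closed-form denoiser of a hypothetical 1D Gaussian code distribution (`MU` and `S` are made-up values), the underdamped Langevin update is a generic BAOAB-style splitting chosen for illustration, and the final decoding of the denoised code by the neural field is omitted.

```python
import math
import random

# Toy 1D stand-in for the code distribution (assumption, not from the paper):
# clean codes z ~ N(MU, S^2), noisy codes y = z + SIGMA * eps.
MU, S = 3.0, 0.5
SIGMA = 1.2          # noise level quoted for GEOM-drugs and CREMP
GAMMA = 1.0          # friction, as quoted
DELTA = SIGMA / 2.0  # step size, as quoted


def score(y):
    """Score of the smoothed density p(y) = N(MU, S^2 + SIGMA^2)."""
    return -(y - MU) / (S ** 2 + SIGMA ** 2)


def denoise(y):
    """Jump: least-squares estimate of z given y, y + SIGMA^2 * score(y)."""
    return y + SIGMA ** 2 * score(y)


def walk_jump(n_steps, rng):
    """Walk with underdamped Langevin MCMC on y, then jump (denoise) once."""
    y = rng.gauss(0.0, SIGMA)  # chain initialized with noise
    v = 0.0
    decay = math.exp(-GAMMA * DELTA)
    kick = math.sqrt(1.0 - decay ** 2)
    for _ in range(n_steps):
        v += 0.5 * DELTA * score(y)                 # half kick from the score
        y += 0.5 * DELTA * v                        # half drift
        v = decay * v + kick * rng.gauss(0.0, 1.0)  # friction and noise
        y += 0.5 * DELTA * v                        # half drift
        v += 0.5 * DELTA * score(y)                 # half kick from the score
    return y, denoise(y)


if __name__ == "__main__":
    noisy, clean = walk_jump(n_steps=1000, rng=random.Random(0))
    print(f"noisy code y = {noisy:.3f}, denoised code = {clean:.3f}")
```

In this toy setting the jump pulls each noisy sample toward `MU`. In the paper's setting the same two-stage structure applies to high-dimensional modulation codes, with the score supplied by the trained denoiser.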