Symphony: Symmetry-Equivariant Point-Centered Spherical Harmonics for 3D Molecule Generation

Authors: Ameya Daigavane, Song Eun Kim, Mario Geiger, Tess Smidt

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that Symphony is able to accurately generate small molecules from the QM9 dataset, outperforming existing autoregressive models and approaching the performance of diffusion models. ... To test our proposed architecture, we apply Symphony to the QM9 dataset and show that it outperforms previous autoregressive models and is competitive with existing diffusion models on a variety of metrics. ... 4 EXPERIMENTAL RESULTS ... From Table 3 we see that Symphony and other autoregressive models struggle to match the bond length distribution of QM9 as well as EDM.
Researcher Affiliation | Collaboration | Ameya Daigavane¹, Song Kim¹, Mario Geiger², Tess Smidt¹; {ameyad,songk}@mit.edu, geiger.mario@gmail.com, tsmidt@mit.edu; ¹Massachusetts Institute of Technology, ²NVIDIA
Pseudocode | Yes | Algorithm 1: CREATEFRAGMENTSEQUENCE ... Algorithm 2: General Operation of a Message Passing Neural Network
Open Source Code | Yes | Our JAX code containing all of the data preprocessing, model training and evaluation metrics is available at https://github.com/atomicarchitects/symphony.
Open Datasets | Yes | Following EDM (Hoogeboom et al., 2022), we obtained the QM9 (Rupp et al., 2012) dataset using the DeepChem library (Ramsundar et al., 2019), and filtered out 3054 uncharacterized molecules (available at https://springernature.figshare.com/ndownloader/files/3195404) which rearranged significantly during geometry optimization, giving us exactly 130831 molecules.
Dataset Splits | Yes | Symphony was trained using the same splits as EDM: 100000 molecules to train, 13083 molecules for validation and 17748 molecules for test, obtained from a random permutation of the molecules.
Hardware Specification | Yes | As measured on a single NVIDIA RTX A5000 GPU, Symphony's inference speed is 0.293 seconds/molecule, compared to EDM's 0.930 sec/mol.
Software Dependencies | No | The paper mentions using 'e3nn-jax library that utilizes the JAX (Bradbury et al., 2018) framework' and 'RDKit (Landrum et al., 2023)' but does not specify version numbers for these software components.
Experiment Setup | Yes | We set σ²_true = 10⁻⁵ and express the Dirac delta distribution in the spherical harmonic basis up to l_max = 5, as explained in Appendix H. ... All parameters in the EMBEDDER, MLP and LINEAR layers are trained with the Adam (Kingma & Ba, 2017) optimizer with a learning rate of 5 × 10⁻⁴. We chose the parameters that achieved the lowest loss on the validation set over 8000000 training steps with a batch size of 16 fragments.
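
The Dataset Splits and Hardware Specification entries above can be sanity-checked with a short sketch. The following is a minimal illustration in plain Python, not the authors' code: the function name `split_indices` and the seed are hypothetical, and the shuffle will not reproduce the exact permutation used by EDM or Symphony. It only shows that the quoted split sizes partition the 130831 filtered molecules, and computes the speed ratio implied by the quoted per-molecule timings.

```python
import random

# Counts quoted in the table above.
N_TOTAL = 130831
N_TRAIN, N_VAL, N_TEST = 100000, 13083, 17748


def split_indices(n_total, n_train, n_val, n_test, seed=0):
    """Partition range(n_total) into train/val/test via one random permutation.

    `seed` is a hypothetical choice; it does not reproduce the EDM permutation.
    """
    if n_train + n_val + n_test != n_total:
        raise ValueError("split sizes must sum to the dataset size")
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # random permutation of molecule indices
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(N_TOTAL, N_TRAIN, N_VAL, N_TEST)
print(len(train_idx), len(val_idx), len(test_idx))

# Ratio of the quoted inference times (seconds per molecule): EDM / Symphony.
speedup = 0.930 / 0.293
print(f"EDM / Symphony time per molecule: {speedup:.2f}x")
```

Under the quoted timings, Symphony generates a molecule a little over three times faster than EDM on the same GPU. Reproducing the paper's actual split would require the permutation from the linked repository rather than the seed assumed here.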