Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations
Authors: Keir Adams, Lagnajit Pattanaik, Connor W. Coley
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our model on four benchmarks: contrastive learning to distinguish conformers of different stereoisomers in a learned latent space, classification of chiral centers as R/S, prediction of how enantiomers rotate circularly polarized light, and ranking enantiomers by their docking scores in an enantiosensitive protein pocket. We compare our model, Chiral InterRoto-Invariant Neural Network (ChIRo), with 2D and 3D GNNs to demonstrate that our model achieves state-of-the-art performance when learning chiral-sensitive functions from molecular structures. |
| Researcher Affiliation | Academia | Keir Adams1, Lagnajit Pattanaik1, Connor W. Coley1,2 1Department of Chemical Engineering 2Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology, Cambridge, MA 02139, USA {keir,lagnajit,ccoley}@mit.edu |
| Pseudocode | No | The paper describes the model architecture and processes in detailed text and figures (e.g., Figure 4), but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/keiradams/ChIRo. [...] we make source code with experimental setups, model implementations, and random seeds available at https://github.com/keiradams/ChIRo. |
| Open Datasets | Yes | We use a subset of the PubChem3D dataset, which consists of multiple OMEGA-generated conformations of organic molecules with up to 50 heavy atoms and 15 rotatable bonds (Bolton et al., 2011; Hawkins et al., 2010). [...] We make our datasets for the contrastive learning, R/S classification, and docking tasks available to the public. [...] Our GitHub site also contains links to the exact datasets and splits used in each experiment for the contrastive learning, R/S classification, and docking tasks. Although copyright restrictions prevent us from releasing the dataset and splits for the l/d classification task, we detail our data filtering and processing steps in appendix A.7. |
| Dataset Splits | Yes | We create 70/15/15 training/validation/test splits, keeping conformers corresponding to the same 2D graphs in the same data partition. [...] The full contrastive learning dataset was split into 70/15/15 sets for training, validation, and testing, respectively [...] The full R/S dataset was similarly separated into 70/15/15 sets [...] We split this dataset into 5 folds for cross-validation, randomly assigning each pair of enantiomers (with their conformers) to a test set in one of the five folds. For each fold, we randomly split the remaining (i.e., non-testing) pairs of enantiomers into 87.5/12.5 training/validation sets. Note that this ensures each fold has 70/10/20 training/validation/test splits [...] We split the full dataset into 70/15/15 training/validation/test sets, assigning pairs of enantiomers (with their conformers) to the same data partition. |
| Hardware Specification | No | The authors acknowledge the MIT SuperCloud and Lincoln Laboratory Supercomputing Center for providing HPC resources that have contributed to the research results reported within this paper. However, no specific details about the type of hardware (e.g., CPU, GPU models, memory) are provided. |
| Software Dependencies | No | We implement our network with PyTorch Geometric (Fey & Lenssen, 2019). [...] using the ETKDG algorithm in RDKit (Landrum, 2010). [...] using the Raytune (Liaw et al., 2018) Python package with the HyperOpt plug-in. [...] which employs the Optuna hyperoptimization framework (Akiba et al., 2019). While several software tools and libraries are mentioned, specific version numbers for these dependencies (e.g., PyTorch Geometric version, RDKit version, Python version) are not provided. |
| Experiment Setup | Yes | This section describes the full training protocols for each task considered in this paper. [...] Table 5 specifies the hyperparameters chosen for each task. See appendix A.6 for details on hyperparameter optimizations. [...] Hyperparameters were tuned for ChIRo on the R/S, l/d, and ranking enantiomers by docking score tasks, using the Raytune (Liaw et al., 2018) Python package with the HyperOpt plug-in. [...] Tables 6, 7 and 8 list the hyperparameters used for SphereNet, SchNet and DimeNet++ on the contrastive learning and R/S classification tasks. |
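The Dataset Splits row describes a group-aware 70/15/15 split in which all conformers sharing the same 2D molecular graph are assigned to the same partition. A minimal sketch of that splitting logic is shown below; the function name `group_split` and the tuple-based conformer representation are illustrative assumptions, not code from the ChIRo repository.

```python
import random
from collections import defaultdict

def group_split(conformers, key_fn, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split conformers into train/val/test so that every conformer sharing a
    2D-graph key (e.g., a canonical SMILES string) lands in one partition.

    Note: fractions apply to the number of distinct 2D graphs, so the
    conformer-level proportions are only approximately 70/15/15.
    """
    # Group conformers by their 2D graph key (hypothetical key_fn).
    groups = defaultdict(list)
    for conf in conformers:
        groups[key_fn(conf)].append(conf)

    # Shuffle group keys deterministically, then allocate whole groups.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_train = int(fractions[0] * len(keys))
    n_val = int(fractions[1] * len(keys))

    def gather(key_slice):
        return [conf for k in key_slice for conf in groups[k]]

    return (gather(keys[:n_train]),
            gather(keys[n_train:n_train + n_val]),
            gather(keys[n_train + n_val:]))
```

Because whole groups are assigned to one partition, no 2D graph ever appears in more than one split, which is the leakage-avoidance property the quoted passage describes.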