Chemically Transferable Generative Backmapping of Coarse-Grained Proteins
Authors: Soojung Yang, Rafael Gomez-Bombarelli
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we perform ablation studies on the model architecture and loss functions, and compare our model with the baseline, CGVAE. For each experiment, we perform five random seed experiments and report the mean and variance of the metrics. Table 1 of the paper reports the architecture ablation. (See the seed-averaging sketch after the table.) |
| Researcher Affiliation | Academia | ¹Computational and Systems Biology, MIT, Cambridge, MA, United States; ²Department of Materials Science and Engineering, MIT, Cambridge, MA, United States. |
| Pseudocode | Yes | Algorithm 1: pseudocode for reconstructing the list of Cartesian coordinates of side-chain atoms, L, for a residue with m side-chain atoms. (A placement sketch appears after the table.) |
| Open Source Code | Yes | Code and dataset for training and inference are available at https://github.com/learningmatter-mit/GenZProt. |
| Open Datasets | Yes | Our training and test data are from the protein structural ensemble database PED (Lazar et al., 2021). |
| Dataset Splits | Yes | We split the train and test set by protein entries (i.e., models never see the test protein entries during training). The validation set is identical to the test set, and the learning rate reduction and early stopping are controlled based on the validation loss. From 227 total entries of PED, we use 84 entries for training, four entries for validation, and four entries for testing. (An entry-level split sketch appears after the table.) |
| Hardware Specification | Yes | Models were trained with Xeon-G6 GPU nodes until convergence, with a maximum runtime of 20 hours. |
| Software Dependencies | No | The paper mentions software like 'e3nn library' and 'PyTorch nn.Embedding' but does not specify their version numbers. |
| Experiment Setup | Yes | Table 8 hyperparameters: node-wise latent variable dimension 36; atom neighbor cutoff 9.0 Å; residue neighbor cutoff 21.0 Å; encoder convolution depth 3; decoder convolution depth 4; maximum training time 20 hr; batch size 4; learning rate 1e-3; KL divergence coefficient β = 0.05; L_local coefficient γ = 1.0; L_torsion coefficient δ = 1.0; L_xyz coefficient η = 1.0; L_steric coefficient ζ = 3.0. (Gathered as a config sketch after the table.) |
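
The five-seed protocol quoted under Research Type is straightforward to mirror. A minimal sketch, assuming a user-supplied `train_and_eval` callable that trains and evaluates a model under a given seed and returns a scalar metric; this is not the authors' code:

```python
import statistics

def evaluate_over_seeds(train_and_eval, seeds=(0, 1, 2, 3, 4)):
    """Run one experiment per random seed and summarize the metric,
    mirroring the paper's mean-and-variance reporting.
    train_and_eval is a placeholder: seed -> scalar metric."""
    scores = [train_and_eval(seed=s) for s in seeds]
    return statistics.mean(scores), statistics.variance(scores)
```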
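
Algorithm 1 (quoted under Pseudocode) rebuilds side-chain Cartesian coordinates from internal coordinates. Below is a minimal NumPy sketch of the standard NeRF-style placement step that such a reconstruction rests on; `place_atom` and all variable names are illustrative assumptions, not taken from the GenZProt repository:

```python
import numpy as np

def place_atom(a, b, c, bond_length, bond_angle, torsion):
    """Place a new atom D from three already-placed atoms A, B, C,
    given the internal coordinates |CD|, angle(B,C,D), dihedral(A,B,C,D).
    Standard NeRF-style placement; angles are in radians."""
    bc = c - b
    bc = bc / np.linalg.norm(bc)          # unit vector along B->C
    n = np.cross(b - a, bc)
    n = n / np.linalg.norm(n)             # normal to the ABC plane
    m = np.cross(n, bc)                   # completes the local frame at C
    # Displacement of D expressed in the (bc, m, n) frame.
    d = bond_length * np.array([
        -np.cos(bond_angle),
        np.sin(bond_angle) * np.cos(torsion),
        np.sin(bond_angle) * np.sin(torsion),
    ])
    return c + d[0] * bc + d[1] * m + d[2] * n

# Example: place a fourth atom from three reference atoms.
A = np.array([0.0, 0.0, 0.0])
B = np.array([1.5, 0.0, 0.0])
C = np.array([2.0, 1.4, 0.0])
D = place_atom(A, B, C, bond_length=1.5,
               bond_angle=np.deg2rad(109.5),
               torsion=np.deg2rad(60.0))
```

Iterating this step over a residue's m side-chain atoms, each placed from three previously placed atoms, yields the list L of Cartesian coordinates the algorithm's caption describes.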
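
The entry-level split quoted under Dataset Splits (84 train / 4 validation / 4 test entries out of 227, with validation identical to test) could be implemented along these lines; `split_by_entry` and its arguments are hypothetical, not the authors' loader:

```python
import random

def split_by_entry(entry_ids, n_train=84, n_test=4, seed=0):
    """Split PED entries so test proteins are never seen during training.
    Hypothetical helper; entry_ids is a list of PED entry identifiers."""
    rng = random.Random(seed)
    ids = list(entry_ids)
    rng.shuffle(ids)
    test = ids[:n_test]
    val = list(test)                      # per the paper, validation == test
    train = ids[n_test:n_test + n_train]  # 84 of the remaining entries
    return train, val, test

# Illustrative usage with placeholder PED-style entry IDs.
train, val, test = split_by_entry([f"PED{i:05d}" for i in range(1, 228)])
```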
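
For reference, the Table 8 hyperparameters gathered into one Python dict; the key names are illustrative guesses, and the GenZProt repository may name them differently:

```python
# Hyperparameters from Table 8 of the paper, collected as a config dict.
# Key names are illustrative, not taken from the GenZProt codebase.
config = {
    "latent_dim": 36,         # node-wise latent variable dimension
    "atom_cutoff": 9.0,       # atom neighbor cutoff [Å]
    "residue_cutoff": 21.0,   # residue neighbor cutoff [Å]
    "encoder_depth": 3,       # encoder convolution depth
    "decoder_depth": 4,       # decoder convolution depth
    "max_train_hours": 20,    # maximum training time [hr]
    "batch_size": 4,
    "lr": 1e-3,               # learning rate
    "beta_kl": 0.05,          # β, KL divergence coefficient
    "gamma_local": 1.0,       # γ, L_local coefficient
    "delta_torsion": 1.0,     # δ, L_torsion coefficient
    "eta_xyz": 1.0,           # η, L_xyz coefficient
    "zeta_steric": 3.0,       # ζ, L_steric coefficient
}
```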