Chemically Transferable Generative Backmapping of Coarse-Grained Proteins

Authors: Soojung Yang, Rafael Gómez-Bombarelli

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments, we perform ablation studies on the model architecture and loss functions, and compare our model with the baseline, CGVAE. Each experiment is run with five random seeds, and we report the mean and variance of the metrics (Table 1: ablation study on the model architecture).
Researcher Affiliation | Academia | (1) Computational and Systems Biology, MIT, Cambridge, MA, United States; (2) Department of Materials Science and Engineering, MIT, Cambridge, MA, United States.
Pseudocode | Yes | Algorithm 1: pseudocode for reconstructing L, the list of Cartesian coordinates of the side-chain atoms of a residue with m side-chain atoms (a hedged sketch of this kind of reconstruction follows the table).
Open Source Code | Yes | Code and dataset for training and inference are available at https://github.com/learningmatter-mit/GenZProt.
Open Datasets | Yes | Our training and test data are from the protein structural ensemble database PED (Lazar et al., 2021).
Dataset Splits | Yes | We split the train and test sets by protein entry (i.e., models never see the test protein entries during training). The validation set is identical to the test set, and learning-rate reduction and early stopping are controlled by the validation loss. Of the 227 total PED entries, we use 84 for training, four for validation, and four for testing (an illustrative split sketch follows the table).
Hardware Specification | Yes | Models were trained on Xeon-G6 GPU nodes until convergence, with a maximum runtime of 20 hours.
Software Dependencies | No | The paper mentions software such as the e3nn library and PyTorch's nn.Embedding but does not specify version numbers.
Experiment Setup | Yes | Table 8 lists the hyperparameters: node-wise latent variable dimension 36; atom neighbor cutoff 9.0 Å; residue neighbor cutoff 21.0 Å; encoder convolution depth 3; decoder convolution depth 4; maximum training time 20 h; batch size 4; learning rate 1e-3; KL-divergence coefficient β = 0.05; L_local coefficient γ = 1.0; L_torsion coefficient δ = 1.0; L_xyz coefficient η = 1.0; L_steric coefficient ζ = 3.0 (collected into a config sketch below).
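
The side-chain reconstruction that Algorithm 1 describes converts predicted internal coordinates (bond lengths, bond angles, torsions) into Cartesian positions. Below is a minimal Python sketch of that kind of procedure using the standard NeRF placement rule; it is not the authors' Algorithm 1, it assumes a simple linear chain of parent atoms, and every name in it is illustrative.

```python
import numpy as np

def place_atom(a, b, c, bond, angle, torsion):
    """Place atom d given already-placed parents a-b-c, the bond length |c-d|,
    the bond angle b-c-d (radians), and the torsion a-b-c-d (radians).
    This is the standard NeRF internal-to-Cartesian construction."""
    bc = (c - b) / np.linalg.norm(c - b)
    n = np.cross(b - a, bc)
    n /= np.linalg.norm(n)
    m = np.cross(n, bc)
    # Displacement of d expressed in the local (bc, m, n) frame.
    d_local = bond * np.array([-np.cos(angle),
                               np.sin(angle) * np.cos(torsion),
                               np.sin(angle) * np.sin(torsion)])
    return c + d_local[0] * bc + d_local[1] * m + d_local[2] * n

def reconstruct_side_chain(backbone, internal_coords):
    """Build the list L of side-chain coordinates for a residue with m atoms.
    `backbone` holds three seed atoms (e.g. N, CA, C as 3-vectors) and
    `internal_coords` holds m (bond, angle, torsion) triples. Real side
    chains branch, so the parent triple would be looked up per atom rather
    than taken from the last three placements as done here."""
    placed = [np.asarray(x, dtype=float) for x in backbone]
    L = []
    for bond, angle, torsion in internal_coords:
        d = place_atom(placed[-3], placed[-2], placed[-1], bond, angle, torsion)
        placed.append(d)
        L.append(d)
    return L
```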
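
The entry-level split quoted above is easy to mirror in code. This sketch partitions PED entry IDs so that no test entry is ever seen during training, and reuses the test entries as the validation set, as the paper states; the selection logic and names are assumptions, not the authors' script.

```python
import random

def split_ped_entries(entry_ids, n_train=84, n_test=4, seed=0):
    """Split PED entry IDs into train/val/test by whole entries.
    Only 84 + 4 = 88 of the 227 entries end up used; the rest are dropped."""
    ids = sorted(entry_ids)
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    # Per the paper, validation is identical to the test set and is used only
    # for learning-rate reduction and early stopping.
    val = list(test)
    return train, val, test
```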
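
Finally, the Table 8 hyperparameters gathered into a plain config dict for quick reference. The values are exactly those quoted in the Experiment Setup row; the key names are illustrative and do not necessarily match the identifiers used in the GenZProt repository.

```python
# Table 8 hyperparameters (values from the paper; key names are illustrative).
GENZPROT_HPARAMS = {
    "node_latent_dim": 36,      # node-wise latent variable dimension
    "atom_cutoff_A": 9.0,       # atom neighbor cutoff [Å]
    "residue_cutoff_A": 21.0,   # residue neighbor cutoff [Å]
    "encoder_conv_depth": 3,
    "decoder_conv_depth": 4,
    "max_train_hours": 20,
    "batch_size": 4,
    "learning_rate": 1e-3,
    "beta_kl": 0.05,            # β, weight on the KL divergence
    "gamma_local": 1.0,         # γ, weight on L_local
    "delta_torsion": 1.0,       # δ, weight on L_torsion
    "eta_xyz": 1.0,             # η, weight on L_xyz
    "zeta_steric": 3.0,         # ζ, weight on L_steric
}
```

The Greek-letter entries read as weights in a summed training objective (a reconstruction term plus β·KL + γ·L_local + δ·L_torsion + η·L_xyz + ζ·L_steric), assuming the usual weighted-sum form; the exact composition of the loss should be checked against the paper.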