Efficient Generative Modelling of Protein Structure Fragments using a Deep Markov Model

Authors: Christian B Thygesen, Christian Skjødt Steenmans, Ahmad Salim Al-Sibahi, Lys Sanz Moreta, Anders Bundgård Sørensen, Thomas Hamelryck

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | BIFROST was trained on a data set of fragments derived from a set of 3733 proteins from the CullPDB data set (Wang & Dunbrack, 2005). Prior to training, the data was randomly split into train, test, and validation sets with a 60/20/20% ratio. BIFROST was benchmarked against Rosetta's fragment picker (Gront et al., 2011) using the precision and coverage metrics.
Researcher Affiliation | Collaboration | (1) Department of Computer Science, University of Copenhagen, Copenhagen, Denmark; (2) Evaxion Biotech, Copenhagen, Denmark; (3) Department of Biology, University of Copenhagen, Copenhagen, Denmark.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include an unambiguous statement about releasing open-source code or provide a direct link to a code repository for the methodology described.
Open Datasets | Yes | BIFROST was trained on a data set of fragments derived from a set of 3733 proteins from the CullPDB data set (Wang & Dunbrack, 2005).
Dataset Splits | Yes | Prior to training, the data was randomly split into train, test, and validation sets with a 60/20/20% ratio.
Hardware Specification | Yes | Training and testing were carried out on a machine equipped with an Intel Xeon CPU E5-2630 and a Tesla M10 GPU.
Software Dependencies | Yes | The presented model was implemented in the deep probabilistic programming language Pyro, version 1.3.0 (Bingham et al., 2019) and PyTorch version 1.4.0 (Paszke et al., 2019).
Experiment Setup | Yes | The final model was trained with a learning rate of 0.0003 with a scheduler reducing the learning rate by 90% when no improvement was seen for 10 epochs. Minibatch size was 200. The Adam optimiser was used with β1 and β2 of 0.96 and 0.999, respectively. The latent space dimensionality was 40. All hidden activations (if not specified above) were ReLU activations. We employed norm scaling of the gradient to a norm of 10.0. Finally, early stopping was employed with a patience of 50 epochs.
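
The quoted dataset-split and experiment-setup details map onto a fairly standard PyTorch training configuration. The sketch below is a minimal, hedged illustration of that configuration, not the authors' code (none is released): only the hyperparameter values (60/20/20% split, minibatch size 200, Adam with learning rate 0.0003 and betas 0.96/0.999, 90% learning-rate reduction after 10 stagnant epochs, gradient norm scaling to 10.0, early stopping with patience 50) come from the paper. The toy model, toy data, and loss function are illustrative placeholders standing in for BIFROST's deep Markov model (latent dimensionality 40, implemented in Pyro 1.3.0 on PyTorch 1.4.0) and the fragment data set.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Toy stand-ins: the paper's model is a deep Markov model written in Pyro 1.3.0
# (pip package "pyro-ppl") on PyTorch 1.4.0; a small regression net and random
# tensors are used here only to make the sketch executable end to end.
model = nn.Sequential(nn.Linear(9, 64), nn.ReLU(), nn.Linear(64, 9))
loss_fn = nn.MSELoss()
data = TensorDataset(torch.randn(1000, 9), torch.randn(1000, 9))

# 60/20/20% train/validation/test split, as quoted under "Dataset Splits".
n_train, n_val = int(0.6 * len(data)), int(0.2 * len(data))
train_set, val_set, test_set = random_split(
    data, [n_train, n_val, len(data) - n_train - n_val])
train_loader = DataLoader(train_set, batch_size=200, shuffle=True)  # minibatch size 200
val_loader = DataLoader(val_set, batch_size=200)

# Adam with learning rate 0.0003 and betas (0.96, 0.999), as quoted.
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4, betas=(0.96, 0.999))
# Reduce the learning rate by 90% when no improvement is seen for 10 epochs.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.1, patience=10)

best_val, stale_epochs = float("inf"), 0
for epoch in range(1000):
    model.train()
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        # Norm scaling of the gradient to a norm of 10.0.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
        optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = sum(loss_fn(model(x), y).item() for x, y in val_loader) / len(val_loader)
    scheduler.step(val_loss)

    # Early stopping with a patience of 50 epochs.
    if val_loss < best_val:
        best_val, stale_epochs = val_loss, 0
    else:
        stale_epochs += 1
        if stale_epochs >= 50:
            break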