Learning from Protein Structure with Geometric Vector Perceptrons

Authors: Bowen Jing, Stephan Eismann, Patricia Suriana, Raphael John Lamarre Townshend, Ron Dror

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate our approach on two important problems in learning from protein structure: model quality assessment and computational protein design. Our approach improves over existing classes of architectures on both problems, including state-of-the-art convolutional neural networks and graph neural networks. We release our code at https://github.com/drorlab/gvp.
Researcher Affiliation | Academia | Bowen Jing, Stephan Eismann, Patricia Suriana, Raphael J.L. Townshend, Ron O. Dror, Stanford University, {bjing, seismann, psuriana, raphael, rondror}@cs.stanford.edu
Pseudocode | Yes | Algorithm 1: Geometric vector perceptron (a minimal sketch appears after this table).
Open Source Code | Yes | We release our code at https://github.com/drorlab/gvp.
Open Datasets | Yes | We use the CATH 4.2 dataset curated by Ingraham et al. (2019), in which all available structures with 40% nonredundancy are partitioned by their CATH (class, architecture, topology/fold, homologous superfamily) classification. The training, validation, and test splits consist of 18204, 608, and 1120 structures, respectively. We train and validate on 79200 candidate structures for 528 targets submitted to CASP 5-10.
Dataset Splits | Yes | The training, validation, and test splits consist of 18204, 608, and 1120 structures, respectively. The MQA training and validation dataset includes 528 targets from CASP 5-10 and 150 candidate structures per target. These targets are partitioned at random into 480 training targets and 48 validation targets.
Hardware Specification | Yes | This takes around two days for both models on a single Titan X GPU.
Software Dependencies | Yes | All models are implemented in TensorFlow 2.1.
Experiment Setup | Yes | For both the MQA and CPD models, we use node and hidden embeddings with 16 vector and 100 scalar channels and edge embeddings with 1 vector and 32 scalar channels. In all training runs, we use the Adam optimizer to perform mini-batch gradient descent. Batches are constructed by grouping structures of similar size to have a maximum of 1800 residues per batch for CPD and 3000 residues per batch for MQA (a batching sketch appears after this table). All models are implemented in TensorFlow 2.1 and trained for a maximum of 100 epochs. We also tune the following hyperparameters over a total of 70 training runs (a sampling sketch appears after this table): learning rate in the range of 10^-4 to 10^-3; dropout probability in the range of 10^-4 to 10^-1; number of graph propagation layers in the range of 3 to 6; relative weight of the MQA pairwise loss in the range of 0 to 2.
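
The Pseudocode row refers to Algorithm 1, the geometric vector perceptron itself. Below is a minimal NumPy sketch of one GVP layer; the shapes, the norm-then-concatenate construction, and the ReLU/sigmoid nonlinearities follow Algorithm 1 of the paper, but the function and variable names (and the random example weights) are illustrative rather than the authors' released code at the GitHub link above.

```python
import numpy as np

def gvp(s, V, Wh, Wm, Wmu, b):
    """One geometric vector perceptron (sketch of Algorithm 1).

    s:   scalar features, shape (n,)
    V:   vector features, shape (nu, 3)
    Wh:  (h, nu), Wmu: (mu, h), Wm: (m, h + n), b: (m,)
    Returns (s', V') of shapes (m,) and (mu, 3).
    """
    Vh = Wh @ V                                   # transform vector channels: (h, 3)
    Vmu = Wmu @ Vh                                # output vector channels: (mu, 3)
    sh = np.linalg.norm(Vh, axis=-1)              # rotation-invariant norms: (h,)
    sm = Wm @ np.concatenate([sh, s]) + b         # mix norms with scalar features
    s_out = np.maximum(sm, 0.0)                   # scalar nonlinearity (ReLU)
    vmu = np.linalg.norm(Vmu, axis=-1, keepdims=True)
    V_out = (1.0 / (1.0 + np.exp(-vmu))) * Vmu    # gate each vector by a sigmoid of its norm
    return s_out, V_out

# Toy usage with the paper's channel counts (100 scalar / 16 vector); h = max(nu, mu).
rng = np.random.default_rng(0)
n = m = 100; nu = mu = h = 16
s, V = rng.normal(size=n), rng.normal(size=(nu, 3))
Wh, Wmu = rng.normal(size=(h, nu)), rng.normal(size=(mu, h))
Wm, b = rng.normal(size=(m, h + n)), np.zeros(m)
s_out, V_out = gvp(s, V, Wh, Wm, Wmu, b)
```

Because the vector path uses only linear maps and norm-based gates, the vector outputs rotate with the inputs while the scalar outputs stay rotation-invariant, which is the point of the layer.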
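
The Experiment Setup row says batches group structures of similar size under a per-batch residue budget (1800 for CPD, 3000 for MQA). The paper does not spell out the grouping procedure; the sketch below is one plausible reading, assuming structures arrive as hypothetical (id, num_residues) pairs.

```python
def length_bucketed_batches(structures, max_residues):
    """Group structures of similar size so each batch stays under a residue budget.

    `structures` is a list of (structure_id, num_residues) pairs; sorting by length
    keeps similarly sized structures in the same batch, as the paper describes.
    """
    batches, batch, count = [], [], 0
    for sid, n in sorted(structures, key=lambda x: x[1]):
        if batch and count + n > max_residues:   # budget exceeded: start a new batch
            batches.append(batch)
            batch, count = [], 0
        batch.append(sid)
        count += n
    if batch:
        batches.append(batch)
    return batches

# e.g. for MQA: length_bucketed_batches(structures, max_residues=3000)
```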
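
For the hyperparameter search, the quoted ranges suggest log-uniform sampling for the learning rate and dropout probability, though the paper reports only the ranges and the 70-run budget; the sampler below is an assumption, with hypothetical key names.

```python
import random

def sample_hparams():
    """Draw one configuration from the paper's stated search ranges.

    Log-uniform sampling for learning rate and dropout is an assumption;
    the paper states only the ranges and that 70 runs were tuned in total.
    """
    return {
        "learning_rate": 10 ** random.uniform(-4, -3),       # 10^-4 to 10^-3
        "dropout": 10 ** random.uniform(-4, -1),             # 10^-4 to 10^-1
        "num_propagation_layers": random.randint(3, 6),      # 3 to 6
        "mqa_pairwise_loss_weight": random.uniform(0.0, 2.0) # 0 to 2
    }

configs = [sample_hparams() for _ in range(70)]
```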