Energy-based models for atomic-resolution protein conformations

Authors: Yilun Du, Joshua Meier, Jerry Ma, Rob Fergus, Alexander Rives

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To evaluate the model, we benchmark on the rotamer recovery task, the problem of predicting the conformation of a side chain from its context within a protein structure, which has been used to evaluate energy functions for protein design. The model achieves performance close to that of the Rosetta energy function, a state-of-the-art method widely used in protein structure prediction and design. Models were trained for 180 thousand parameter updates using 32 NVIDIA V100 GPUs, a batch size of 16,384, and the Adam optimizer (α = 2 × 10⁻⁴, β₁ = 0.99, β₂ = 0.999). We evaluated training progress using a held-out 5% subset of the training data as a validation set." (A rotamer-recovery evaluation sketch follows the table.)
Researcher Affiliation | Collaboration | Yilun Du, Massachusetts Institute of Technology, Cambridge, MA, yilundu@mit.edu; Joshua Meier, Facebook AI Research, New York, NY, jmeier@fb.com; Jerry Ma, Facebook AI Research, Menlo Park, CA, maj@fb.com; Rob Fergus, Facebook AI Research & New York University, New York, NY, robfergus@fb.com; Alexander Rives, New York University, New York, NY, arives@cs.nyu.edu
Pseudocode | Yes | "Algorithm 1 Training Procedure for the EBM" (a hedged training-loop sketch follows the table)
Open Source Code | Yes | "Data and code for experiments are available at https://github.com/facebookresearch/protein-ebm"
Open Datasets | Yes | "We constructed a curated dataset of high-resolution PDB structures using the Cull PDB database, with the following criteria: resolution finer than 1.8 Å; sequence identity less than 90%; and R value less than 0.25 as defined in Wang & R. L. Dunbrack (2003). To test the model on rotamer recovery, we use the test set of structures from Leaver-Fay et al. (2013)." (A dataset-curation sketch follows the table.)
Dataset Splits | Yes | "We evaluated training progress using a held-out 5% subset of the training data as a validation set."
Hardware Specification | Yes | "Models were trained for 180 thousand parameter updates using 32 NVIDIA V100 GPUs, a batch size of 16,384, and the Adam optimizer (α = 2 × 10⁻⁴, β₁ = 0.99, β₂ = 0.999)."
Software Dependencies | No | The paper mentions using the 'Adam optimizer' but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | "Models were trained for 180 thousand parameter updates using 32 NVIDIA V100 GPUs, a batch size of 16,384, and the Adam optimizer (α = 2 × 10⁻⁴, β₁ = 0.99, β₂ = 0.999). For all experiments, we use a 6-layer Transformer with embedding dimension of 256 (split over 8 attention heads) and feed-forward dimension of 1024. The final MLP contains 256 hidden units. The models are trained without dropout. Layer normalization (Ba et al., 2016) is applied before the attention blocks." (Architecture and training-step sketches follow the table.)
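
The dataset row's filtering criteria (resolution finer than 1.8 Å, sequence identity below 90%, R value below 0.25) and the 5% held-out validation subset map onto a simple curation-and-split step. The Python sketch below is a minimal illustration, assuming each Cull PDB entry has already been parsed into a dict with `resolution`, `seq_identity`, and `r_value` fields; those field names and the `make_splits` helper are hypothetical, and only the thresholds and the 5% fraction come from the quoted text.

```python
import random

def passes_filters(entry):
    """Apply the quoted curation criteria: resolution < 1.8 Å,
    sequence identity < 90%, and R value < 0.25."""
    return (entry["resolution"] < 1.8
            and entry["seq_identity"] < 0.90
            and entry["r_value"] < 0.25)

def make_splits(entries, val_fraction=0.05, seed=0):
    """Hold out 5% of the curated structures as a validation set,
    keeping the remaining 95% for training."""
    curated = [e for e in entries if passes_filters(e)]
    rng = random.Random(seed)
    rng.shuffle(curated)
    n_val = int(len(curated) * val_fraction)
    return curated[n_val:], curated[:n_val]  # (train, validation)
```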
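
The experiment-setup row fully specifies the encoder hyperparameters, so a PyTorch-style sketch of that architecture is given below: a 6-layer pre-layer-norm Transformer encoder with embedding dimension 256 split over 8 heads, feed-forward dimension 1024, no dropout, and a final MLP with 256 hidden units producing a scalar energy. The class name, the assumption that inputs arrive as pre-computed 256-dimensional atom embeddings, and the mean pooling before the MLP are illustrative choices, not details taken from the paper.

```python
import torch
import torch.nn as nn

class RotamerEnergyModel(nn.Module):
    """Sketch of the described encoder: 6 Transformer layers, d_model=256,
    8 attention heads, feed-forward dim 1024, no dropout, pre-layer-norm,
    followed by an MLP with 256 hidden units that outputs a scalar energy."""

    def __init__(self):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=256, nhead=8, dim_feedforward=1024,
            dropout=0.0, norm_first=True,   # layer norm before attention
            batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=6)
        self.mlp = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, atom_features):
        # atom_features: (batch, n_atoms, 256) embeddings of a side chain
        # and its structural context (featurization not shown here).
        h = self.encoder(atom_features)
        pooled = h.mean(dim=1)               # pooling choice is an assumption
        return self.mlp(pooled).squeeze(-1)  # one energy value per example
```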
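
Algorithm 1 itself is not quoted in the table, so the update below is a generic contrastive EBM training step rather than the authors' exact procedure: the native rotamer is scored against K sampled alternatives, and the model is trained with a softmax over negated energies so that the native conformation receives the lowest energy. Only the Adam settings (α = 2 × 10⁻⁴, β₁ = 0.99, β₂ = 0.999) are taken from the quoted text; the loss, the negative-sampling scheme, and the `training_step` helper are assumptions, and the quoted batch size of 16,384 spread over 32 V100 GPUs would additionally require data-parallel training, which is omitted here. The sketch reuses the `RotamerEnergyModel` class from the previous block.

```python
import torch
import torch.nn.functional as F

model = RotamerEnergyModel()
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, betas=(0.99, 0.999))

def training_step(pos_features, neg_features):
    """pos_features: (B, n_atoms, 256) contexts with the native rotamer.
    neg_features: (B, K, n_atoms, 256) the same contexts with K sampled
    alternative rotamers substituted in (sampling scheme not shown)."""
    B, K = neg_features.shape[:2]
    e_pos = model(pos_features)                                   # (B,)
    e_neg = model(neg_features.reshape(B * K, *neg_features.shape[2:]))
    e_neg = e_neg.reshape(B, K)                                   # (B, K)
    # Softmax over negated energies: the native rotamer is treated as the
    # correct "class" among K + 1 candidates.
    logits = -torch.cat([e_pos.unsqueeze(1), e_neg], dim=1)
    loss = F.cross_entropy(logits, torch.zeros(B, dtype=torch.long))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```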
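
Finally, the rotamer recovery benchmark quoted in the Research Type row can be phrased as an energy-ranking evaluation: score every candidate rotamer for a residue, pick the lowest-energy one, and count the residue as recovered if that candidate matches the native conformation. The sketch below assumes pre-featurized candidates, hypothetical field names, and a 20-degree per-χ-angle tolerance, which is a common convention for this benchmark but is not quoted in the table.

```python
import torch

def rotamer_recovery(model, test_cases, chi_tol_deg=20.0):
    """Fraction of residues whose lowest-energy candidate rotamer has all
    chi angles within `chi_tol_deg` of the native ones (tolerance assumed)."""
    recovered = 0
    for case in test_cases:
        # case["candidates"]: (K, n_atoms, 256) features for library rotamers
        # case["chis"]: (K, n_chi) candidate chi angles in degrees
        # case["native_chis"]: (n_chi,) chi angles of the crystal structure
        with torch.no_grad():
            energies = model(case["candidates"])      # (K,)
        best = energies.argmin().item()
        diff = (case["chis"][best] - case["native_chis"]).abs() % 360.0
        diff = torch.minimum(diff, 360.0 - diff)      # wrap angular differences
        recovered += int(bool((diff <= chi_tol_deg).all()))
    return recovered / len(test_cases)
```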