Reconstructing continuous distributions of 3D protein structure from cryo-EM images

Authors: Ellen D. Zhong, Tristan Bepler, Joseph H. Davis, Bonnie Berger

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate that the proposed method, termed cryoDRGN, can perform ab initio reconstruction of 3D protein complexes from simulated and real 2D cryo-EM image data. We present results on both real and simulated cryo-EM data."
Researcher Affiliation | Academia | Ellen D. Zhong (MIT, zhonge@mit.edu), Tristan Bepler (MIT, tbepler@mit.edu), Joseph H. Davis (MIT, jhdavis@mit.edu), Bonnie Berger (MIT, bab@mit.edu)
Pseudocode | Yes | "Algorithm 1: CryoDRGN branch and bound with frequency marching"
Open Source Code | No | The paper does not include an explicit statement about releasing source code, nor does it provide any links to a code repository.
Open Datasets | Yes | "We create two synthetic datasets following the cryo-EM image formation model (image size D=128, 50k projections, with and without noise), and use one real dataset from EMPIAR-10028 consisting of 105,247 images of the 80S ribosome downsampled to image size D=90. ... We used the dataset from EMPIAR-10076 which contains 131,899 images of the E. coli large ribosomal subunit (LSU) in various stages of assembly (Davis et al. (2016))."
Dataset Splits | No | The paper mentions training in minibatches and epochs and validates results against existing tools, but it does not specify a distinct validation split (e.g., percentages, sample counts, or named splits) separate from training or testing.
Hardware Specification | Yes | "Training times are reported for 50k, D=128 image datasets trained on a Nvidia Titan V GPU."
Software Dependencies | No | The paper mentions the 'Adam optimizer (Kingma & Ba (2014))' and 'Pytorch (Paszke et al. (2017))' but does not provide version numbers for these or other software dependencies.
Experiment Setup | Yes | "Unless otherwise specified, the encoder and decoder networks are both MLPs containing 10 hidden layers of dimension 128 with ReLU activations. ... We use the Adam optimizer (Kingma & Ba (2014)) with learning rate of 5e-4 for experiments involving noiseless, homogeneous datasets, and 1e-4 for all other experiments. ... For each dataset, we train the volume decoder (10 hidden layers of dimension 128) in minibatches of 10 images with random orientations for the first epoch ... followed by 4 epochs with branch and bound (BNB) pose inference ... train a separate, larger volume decoder (10 hidden layers of dimension 500) for 15 epochs with fixed poses ..."
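The experiment setup quoted above fully specifies the network shape (MLPs with 10 hidden layers of dimension 128, ReLU activations) and the optimizer (Adam, learning rate 5e-4 or 1e-4). A minimal PyTorch sketch of that configuration follows; the input/output dimensions and the `make_mlp` helper are assumptions for illustration, not the authors' actual code.

```python
import torch
import torch.nn as nn

def make_mlp(in_dim: int, out_dim: int,
             n_hidden: int = 10, hidden_dim: int = 128) -> nn.Sequential:
    """Build an MLP with n_hidden hidden layers of width hidden_dim and
    ReLU activations, matching the architecture reported in the paper."""
    layers = [nn.Linear(in_dim, hidden_dim), nn.ReLU()]
    for _ in range(n_hidden - 1):
        layers += [nn.Linear(hidden_dim, hidden_dim), nn.ReLU()]
    layers.append(nn.Linear(hidden_dim, out_dim))
    return nn.Sequential(*layers)

# The decoder maps a 3D frequency coordinate (plus a latent code, omitted
# here) to a Fourier coefficient; the exact in/out dims are assumptions.
decoder = make_mlp(in_dim=3, out_dim=1)

# lr=5e-4 for noiseless homogeneous data; 1e-4 for all other experiments.
optimizer = torch.optim.Adam(decoder.parameters(), lr=5e-4)
```

The larger decoder used for the final fixed-pose training phase would be built the same way with `hidden_dim=500`.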
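The report notes that the paper's pseudocode (Algorithm 1) describes branch-and-bound pose search with frequency marching. The core idea is to score candidate poses using only low-frequency Fourier components first, prune to the best candidates, and re-score survivors at progressively higher frequency cutoffs. The sketch below illustrates that pruning loop under simplifying assumptions; `decode_slice(pose, k)` is a hypothetical callable returning the predicted central slice, and the cutoff schedule and `keep` count are illustrative, not the paper's values.

```python
import numpy as np

def frequency_marching_search(image_ft, candidate_poses, decode_slice,
                              k_min=8, k_max=24, keep=8):
    """Illustrative branch-and-bound pose search with frequency marching:
    score poses on frequencies up to radius k, keep the best few, then
    raise k and re-score the survivors. image_ft is a D x D Fourier-space
    image; decode_slice(pose, k) is a hypothetical predicted slice."""
    D = image_ft.shape[0]
    yy, xx = np.mgrid[:D, :D] - D // 2
    radius = np.hypot(xx, yy)
    poses = list(candidate_poses)
    for k in range(k_min, k_max + 1, 4):
        mask = radius <= k  # only compare frequencies inside radius k
        errors = [np.sum(np.abs(image_ft[mask] - decode_slice(p, k)[mask]) ** 2)
                  for p in poses]
        order = np.argsort(errors)
        poses = [poses[i] for i in order[:keep]]  # prune to best candidates
    return poses[0]
```

Because most candidates are eliminated at cheap low-frequency comparisons, only a few poses ever need to be evaluated at full resolution, which is what makes the search tractable.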