Generative Modeling of Molecular Dynamics Trajectories
Authors: Bowen Jing, Hannes Stärk, Tommi Jaakkola, Bonnie Berger
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the full set of these capabilities on tetrapeptide simulations and show preliminary results on scaling to protein monomers. Altogether, our work illustrates how generative modeling can unlock value from MD data towards diverse downstream tasks that are not straightforward to address with existing methods or even MD itself. We evaluate MDGEN on the forward simulation, interpolation, upsampling, and inpainting tasks on tetrapeptides in a transferable setting (i.e., unseen test peptides). |
| Researcher Affiliation | Academia | Bowen Jing¹, Hannes Stärk¹, Tommi Jaakkola¹, Bonnie Berger¹˒² — ¹CSAIL, Massachusetts Institute of Technology; ²Dept. of Mathematics, Massachusetts Institute of Technology. {bjing, hstark}@mit.edu, tommi@csail.mit.edu, bab@mit.edu |
| Pseudocode | Yes | Algorithm 1: Velocity network, Algorithm 2: Diffusion Transformer Attention Layer, Algorithm 3: Invariant Point Attention Layer are provided in Appendix A.1. |
| Open Source Code | Yes | Code is available at https://github.com/bjing2016/mdgen. |
| Open Datasets | Yes | For proteins, we use explicit-solvent, all-atom simulations from the ATLAS dataset (Vander Meersche et al., 2024). To obtain tetrapeptide MD trajectories for training and evaluation, we run implicit- and explicit-solvent, all-atom simulations of 3000 training, 100 validation, and 100 test tetrapeptides for 100 ns. |
| Dataset Splits | Yes | For explicit-solvent settings (forward simulation, interpolation, inpainting), we run simulations for 3109 training, 100 validation, and 100 test peptides. For implicit-solvent settings (upsampling), we run simulations for 2646 training, 100 validation, and 100 test peptides. |
| Hardware Specification | Yes | MD runtimes in Table 2 are tabulated on an NVIDIA T4 GPU. All MDGEN experiments are carried out on NVIDIA A6000 GPUs. AlphaFlow and MSA subsampling runtimes in Table 4 are tabulated on NVIDIA A100 GPUs by Jing et al. (2024). |
| Software Dependencies | No | The paper mentions software like 'OpenMM (Eastman et al., 2017)' and 'PyEMMA (Scherer et al., 2015; Wehmeyer et al.)' and 'amber14 force field parameters', but it does not specify explicit version numbers for these software packages or libraries used in their experimental setup within the main text or appendices. |
| Experiment Setup | Yes | Unless otherwise specified, models are trained with trajectory timesteps of t = 10 ps. All simulations are integrated with a Langevin thermostat at 350 K with hydrogen bond constraints, timestep 2 fs, and friction coefficient 0.3 ps⁻¹ (explicit) or 0.1 ps⁻¹ (implicit). |
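As a sanity check on the simulation settings quoted in the table, the frame count, integrator step count, and per-step Langevin damping implied by a 100 ns trajectory follow directly from the stated numbers. The sketch below is an illustration of that arithmetic under those assumptions, not code from the paper:

```python
import math

# Settings quoted in the reproducibility table (assumed values for
# illustration -- this is not the authors' actual setup script).
TRAJ_LENGTH_NS = 100.0   # total simulation length: 100 ns
SAVE_INTERVAL_PS = 10.0  # trajectory timestep used for training: 10 ps
MD_TIMESTEP_FS = 2.0     # integrator timestep: 2 fs
FRICTION_PER_PS = 0.3    # Langevin friction, explicit solvent, in ps^-1

# Number of saved frames per trajectory at the 10 ps save interval.
n_frames = int(TRAJ_LENGTH_NS * 1000.0 / SAVE_INTERVAL_PS)

# Number of 2 fs integrator steps needed to cover 100 ns.
n_md_steps = int(TRAJ_LENGTH_NS * 1e6 / MD_TIMESTEP_FS)

# Velocity damping factor applied per integrator step by the
# Langevin thermostat: exp(-gamma * dt), with dt converted to ps.
damping = math.exp(-FRICTION_PER_PS * MD_TIMESTEP_FS * 1e-3)

print(n_frames)    # 10000 frames per 100 ns trajectory
print(n_md_steps)  # 50000000 integrator steps
print(round(damping, 6))
```

The large ratio between integrator steps and saved frames (5,000:1 here) is what makes direct generative modeling of the 10 ps-resolution trajectory attractive relative to re-running MD.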