Foundation Inference Models for Markov Jump Processes
Authors: David Berghaus, Kostadin Cvejoski, Patrick Seifner, César Ali Ojeda Marin, Ramsés J. Sánchez
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate that one and the same (pretrained) recognition model can infer, in a zero-shot fashion, hidden MJPs evolving in state spaces of different dimensionalities. Specifically, we infer MJPs which describe (i) discrete flashing ratchet systems, which are a type of Brownian motor, and the conformational dynamics in (ii) molecular simulations, (iii) experimental ion channel data and (iv) simple protein folding models. |
| Researcher Affiliation | Collaboration | Lamarr Institute, Fraunhofer IAIS, University of Bonn & University of Potsdam |
| Pseudocode | Yes | Algorithm 1 Gillespie Algorithm for Markov Jump Processes (see the hedged simulator sketch after the table) |
| Open Source Code | Yes | Our pretrained model, repository and tutorials are available online: https://fim4science.github.io/OpenFIM/intro.html |
| Open Datasets | Yes | Our FIM was (pre)trained on a dataset of 45K MJPs, defined over state spaces whose sizes range from 2 to 6. |
| Dataset Splits | No | The paper mentions 'early stopping' which implies the use of a validation set, and discusses 'training range' and 'evaluation set', but does not explicitly provide percentages or counts for training, validation, and test splits for the synthetic dataset. |
| Hardware Specification | Yes | All models were trained on two A100 80GB GPUs for approximately 500 epochs, or approximately 2.5 days on average per model. |
| Software Dependencies | No | The paper mentions 'AdamW' as the optimizer but does not provide specific version numbers for software libraries such as PyTorch, TensorFlow, or Python, which are necessary for a reproducible ancillary software description. |
| Experiment Setup | Yes | Hyperparameters were tuned using a grid search method. The optimizer utilized was AdamW (Loshchilov and Hutter, 2017), with a learning rate and weight decay both set at 1e-4. A batch size of 128 was used. During the grid search, we experimented with the hidden size of the path encoder ([64, 128, 256, 512]), the hidden size of the path attention network ([128, 256]), and various MLP architectures for ϕ1, ϕ2, and ϕ3 ([[32, 32], [128, 128]]). Early stopping was employed as the stopping criterion. (A hedged configuration sketch follows the table.) |
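
The Pseudocode row refers to the standard Gillespie algorithm for simulating Markov jump processes. Below is a minimal sketch of such a simulator, not the authors' Algorithm 1 verbatim; the function name `gillespie_mjp` and its interface are illustrative, assuming the process is specified by a K×K intensity matrix Q with non-negative off-diagonal rates and rows summing to zero.

```python
import numpy as np

def gillespie_mjp(Q, x0, t_max, rng=None):
    """Simulate one path of a finite-state Markov jump process.

    Q     : (K, K) intensity matrix; Q[i, j] >= 0 for i != j is the jump
            rate i -> j, and Q[i, i] = -sum_{j != i} Q[i, j].
    x0    : initial state index.
    t_max : simulation horizon.
    Returns (times, states): jump times (starting at 0) and the state
    occupied from each time onward.
    """
    rng = np.random.default_rng() if rng is None else rng
    t, x = 0.0, x0
    times, states = [t], [x]
    while True:
        total_rate = -Q[x, x]                  # total exit rate of current state
        if total_rate <= 0:                    # absorbing state: stop
            break
        t += rng.exponential(1.0 / total_rate)  # waiting time ~ Exp(total_rate)
        if t >= t_max:
            break
        probs = np.clip(Q[x], 0.0, None)       # keep off-diagonal rates only
        probs[x] = 0.0
        x = rng.choice(len(probs), p=probs / probs.sum())  # sample next state
        times.append(t)
        states.append(x)
    return np.array(times), np.array(states)

# Example: simulate a 3-state process up to time 10
Q = np.array([[-1.0, 0.6, 0.4],
              [ 0.3, -0.8, 0.5],
              [ 0.2, 0.7, -0.9]])
times, states = gillespie_mjp(Q, x0=0, t_max=10.0)
```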
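The Experiment Setup row can be read as the configuration sketch below, assuming a PyTorch training loop; `build_fim` and `train` are hypothetical placeholders for the authors' model constructor and training routine, which are not reproduced here. Only the grid values, the AdamW settings (learning rate and weight decay of 1e-4), the batch size of 128, and the use of early stopping come from the paper.

```python
import itertools
import torch

# Hyperparameter grid reported in the paper.
grid = {
    "path_encoder_hidden": [64, 128, 256, 512],
    "path_attention_hidden": [128, 256],
    "mlp_layers": [[32, 32], [128, 128]],  # architecture choice for phi_1, phi_2, phi_3
}

BATCH_SIZE = 128  # reported batch size

def make_optimizer(model):
    # AdamW with learning rate and weight decay both set to 1e-4, as reported.
    return torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)

for enc_h, attn_h, mlp in itertools.product(*grid.values()):
    config = dict(path_encoder_hidden=enc_h,
                  path_attention_hidden=attn_h,
                  mlp_layers=mlp)
    # model = build_fim(**config)                                  # hypothetical constructor
    # optimizer = make_optimizer(model)
    # train(model, optimizer, batch_size=BATCH_SIZE,               # hypothetical training
    #       early_stopping=True)                                   # routine with early stopping
    print(config)
```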