Foundation Inference Models for Markov Jump Processes
Authors: David Berghaus, Kostadin Cvejoski, Patrick Seifner, César Ali Ojeda Marin, Ramsés J. Sánchez
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate that one and the same (pretrained) recognition model can infer, in a zero-shot fashion, hidden MJPs evolving in state spaces of different dimensionalities. Specifically, we infer MJPs which describe (i) discrete flashing ratchet systems, which are a type of Brownian motor, and the conformational dynamics in (ii) molecular simulations, (iii) experimental ion channel data and (iv) simple protein folding models. |
| Researcher Affiliation | Collaboration | Lamarr Institute, Fraunhofer IAIS, University of Bonn & University of Potsdam |
| Pseudocode | Yes | Algorithm 1 Gillespie Algorithm for Markov Jump Processes (see the hedged simulator sketch after the table) |
| Open Source Code | Yes | Our pretrained model, repository and tutorials are available online: https://fim4science.github.io/OpenFIM/intro.html |
| Open Datasets | Yes | Our FIM was (pre)trained on a dataset of 45K MJPs, defined over state spaces whose sizes range from 2 to 6. |
| Dataset Splits | No | The paper mentions 'early stopping' which implies the use of a validation set, and discusses 'training range' and 'evaluation set', but does not explicitly provide percentages or counts for training, validation, and test splits for the synthetic dataset. |
| Hardware Specification | Yes | All models were trained on two A100 80GB GPUs for approximately 500 epochs, or approximately 2.5 days on average per model. |
| Software Dependencies | No | The paper mentions 'AdamW' as the optimizer but does not provide specific version numbers for software libraries such as PyTorch, TensorFlow, or Python, which are necessary for a reproducible ancillary software description. |
| Experiment Setup | Yes | Hyperparameters were tuned using a grid search method. The optimizer utilized was AdamW (Loshchilov and Hutter, 2017), with a learning rate and weight decay both set at 1e-4. A batch size of 128 was used. During the grid search, we experimented with the hidden size of the path encoder ([64, 128, 256, 512]), the hidden size of the path attention network ([128, 256]), and various MLP architectures for ϕ1, ϕ2, and ϕ3 ([[32, 32], [128, 128]]). Early stopping was employed as the stopping criterion. (A hedged configuration sketch follows the table.) |
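
The Pseudocode row refers to the standard Gillespie algorithm for simulating Markov jump processes. Below is a minimal sketch of such a simulator, not the authors' Algorithm 1 verbatim; the function name `gillespie_mjp` and its interface are illustrative, assuming the process is specified by a K×K intensity matrix Q with non-negative off-diagonal rates and rows summing to zero.

```python
import numpy as np

def gillespie_mjp(Q, x0, t_max, rng=None):
    """Simulate one path of a finite-state Markov jump process.

    Q     : (K, K) intensity matrix; Q[i, j] >= 0 for i != j is the jump
            rate i -> j, and Q[i, i] = -sum_{j != i} Q[i, j].
    x0    : initial state index.
    t_max : simulation horizon.
    Returns (times, states): jump times (starting at 0) and the state
    occupied from each time onward.
    """
    rng = np.random.default_rng() if rng is None else rng
    t, x = 0.0, x0
    times, states = [t], [x]
    while True:
        total_rate = -Q[x, x]                  # total exit rate of current state
        if total_rate <= 0:                    # absorbing state: stop
            break
        t += rng.exponential(1.0 / total_rate)  # waiting time ~ Exp(total_rate)
        if t >= t_max:
            break
        probs = np.clip(Q[x], 0.0, None)       # keep off-diagonal rates only
        probs[x] = 0.0
        x = rng.choice(len(probs), p=probs / probs.sum())  # sample next state
        times.append(t)
        states.append(x)
    return np.array(times), np.array(states)

# Example: simulate a 3-state process up to time 10
Q = np.array([[-1.0, 0.6, 0.4],
              [ 0.3, -0.8, 0.5],
              [ 0.2, 0.7, -0.9]])
times, states = gillespie_mjp(Q, x0=0, t_max=10.0)
```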
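The Experiment Setup row can be read as the configuration sketch below, assuming a PyTorch training loop; `build_fim` and `train` are hypothetical placeholders for the authors' model constructor and training routine, which are not reproduced here. Only the grid values, the AdamW settings (learning rate and weight decay of 1e-4), the batch size of 128, and the use of early stopping come from the paper.

```python
import itertools
import torch

# Hyperparameter grid reported in the paper.
grid = {
    "path_encoder_hidden": [64, 128, 256, 512],
    "path_attention_hidden": [128, 256],
    "mlp_layers": [[32, 32], [128, 128]],  # architecture choice for phi_1, phi_2, phi_3
}

BATCH_SIZE = 128  # reported batch size

def make_optimizer(model):
    # AdamW with learning rate and weight decay both set to 1e-4, as reported.
    return torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)

for enc_h, attn_h, mlp in itertools.product(*grid.values()):
    config = dict(path_encoder_hidden=enc_h,
                  path_attention_hidden=attn_h,
                  mlp_layers=mlp)
    # model = build_fim(**config)                                  # hypothetical constructor
    # optimizer = make_optimizer(model)
    # train(model, optimizer, batch_size=BATCH_SIZE,               # hypothetical training
    #       early_stopping=True)                                   # routine with early stopping
    print(config)
```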